As announced (you heard it here first!), I spoke at PyData Berlin 2014. You can find the slides on Speaker Deck and a video of the presentation on YouTube. The video seems to be rather popular on the PyData YouTube channel, but we’ll see whether it stands the test of time.
Lots of bookmarks this time:
- A command-line utility written in Go for batch-sending email.
- Ma (negative space)
- I’m always fascinated by the Japanese culture.
- Find out what’s keeping your Mac awake
- Sometimes selecting Sleep from the Apple menu doesn’t do anything. In that case, there’s a Terminal command that’ll tell you which processes are keeping your Mac awake. This is SO useful!
- What I Wish I Knew When Learning Haskell ( Stephen Diehl )
- A skimmable reference for intermediate level Haskell topics and an aggregate of the best external resources for diving into those subjects with more depth.
- A command line tool for showing the progress of long-running coreutil functions like
- You’re probably using the wrong dictionary « the jsomers.net blog
- Add the best English dictionary to your Mac.
- Online syntax highlighting for “MySQL”
- Vim Awesome
- Awesome Vim plugins from across the universe.
- Cloudmarks - Canisbos
- Cloudmarks (formerly Moofmarks) is a Safari extension that works with cloud bookmarking services Pinboard, Delicious, Kippt, and Google Bookmarks, letting you access your cloud bookmarks in a convenient popover.
- A web-based launchd.plist generator.
- 100+ Interesting Data Sets for Statistics
- Looking for interesting data sets? Here’s a list of more than 100 of the best stuff, from dolphin relationships to political campaign donations to death row prisoners.
- The utility contacts gives you access from the terminal to view and search all your records in the Address Book database.
- Shaping up with Angular.js
- Learn to build an application using Angular.js
- Syncthing replaces Dropbox and BitTorrent Sync by being open and decentralised. Runs on OS X, Windows, Linux, FreeBSD and Solaris.
- Vincent takes Python data structures and translates them into Vega visualization grammar. It allows for quick iteration of visualization designs via getters and setters on grammar elements, and outputs the final visualization to JSON. Perhaps most importantly, Vincent groks Pandas DataFrames and Series in an intuitive way.
I need to be on vacation in a house without internet connectivity[1] to catch up with what is happening on the web. So two months after Maciej Cegłowski gave a talk at Beyond Tellerrand in Düsseldorf, here I am, linking to it.
The talk is about the consequences of the internet on our lives. Maciej worked at Yahoo and runs a profitable business selling something (almost) all his competitors are giving away for free, so he’s no fool and mostly knows what he’s saying. As a bonus, his sense of humour is almost unmatched on the web (follow him if you don’t believe me).
If you’re short on time (like me), at least read it up to the second animal picture. But you’ll miss gems like:
You can dress up a bug and call it a feature. You can also put dog crap in the freezer and call it ice cream. But people can taste the difference.
- You may see this post appear on my site days (if not weeks) after I wrote it. [return]
I am thrilled to announce that I will speak this July (25th and 26th, to be precise) at PyData Berlin 2014, about Python and pandas as the back end for real-time data driven applications. From the abstract of the talk:
For data, and data science, to be the fuel of the 21st century, data driven applications should not be confined to dashboards and static analyses. Instead they should be the driver of the organizations that own or generate the data. Most of these applications are web-based and require real-time access to the data. However, many Big Data analyses and tools are inherently batch-driven and not well suited for real-time and performance-critical connections with applications. Trade-offs often become inevitable, especially when mixing multiple tools and data sources. In this talk we will describe our journey to build a data driven application at a large Dutch financial institution. We will dive into the issues we faced, why we chose Python and pandas and what that meant for real-time data analysis (and agile development). Important points in the talk will be, among others, the handling of geographical data, the access to hundreds of millions of records as well as the real-time analysis of millions of data points.
In case you’re in the Netherlands (or nearby! Hello Belgians), and are interested in big data and data science, software development, continuous integration or architecture, then you should come to Xebicon 2014. At €95 it is dirt cheap (you get good food too!) but if you’re a swell chap I have a promo code for you (just drop me a line and you’ll get the early bird price).
Some highlights from the sessions:
- The plenary is with André Kuipers, speaking about innovation; André is one of the first Dutch astronauts;
- Understanding natural language with 1500-year-old math: Jonatan, a colleague of mine, tries to classify the questions that you can type in a search box. He uses Stack Overflow data to train his classifier;
- iOS: Full speed ahead in your pants (Performance tuning on iOS): Jeroen will help the audience understand and analyze one of the most difficult, but most important, topics of mobile development: performance!
- Data: not as tough as you think: Big data, Machine Learning, real-time processing and predictive analytics: reserved for the data caste or something for everybody? Listen to the CTO of GoDataDriven share his knowledge.
Depends on what you think about this snippet of code:[1]
let square y = y * y; limit = 100 in [(x, y, z) | y <- [1..limit], x <- [1..y], z <- [1..limit], square x + square y == square z]
- It finds all the right triangles with integer sides no larger than limit, without duplicates. If you want the version with duplicates, just use x <- [1..limit]. [return]
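For comparison, the same comprehension can be sketched in Python. This is only a rough translation for readers who don't speak Haskell, not part of the original snippet; the function name right_triangles is hypothetical and chosen for the illustration.

```python
def right_triangles(limit):
    """Return all (x, y, z) with x <= y, sides in 1..limit and x^2 + y^2 == z^2.

    Mirrors the Haskell list comprehension: y is the outer generator,
    x only runs up to y (which is what removes the mirrored duplicates),
    and z is searched over the full range.
    """
    return [
        (x, y, z)
        for y in range(1, limit + 1)
        for x in range(1, y + 1)
        for z in range(1, limit + 1)
        if x * x + y * y == z * z
    ]


if __name__ == "__main__":
    # Same limit as in the Haskell snippet.
    for triangle in right_triangles(100):
        print(triangle)
```

Letting x run over range(1, limit + 1) instead gives the version with duplicates mentioned in the footnote.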
- Learn Web Development From Scratch - SlideRule
- Udacity’s awesome course, CS 253 - Web Development, by Steve Huffman, forms the backbone of this course.
- Spiped is a utility for creating symmetrically encrypted and authenticated pipes between socket addresses, so that one may connect to one address and transparently have a connection established to another address. This is similar to ‘ssh -L’ functionality, but does not use SSH and requires a pre-shared symmetric key.
- Skala Color, a Mac color picker by Bjango
- Skala Color is a compact and feature-rich OS X color picker that works with a huge variety of formats, covering everything you’re likely to need for web, iOS, Android, and OS X development — Hex, CSS RGBA, CSS HSLA, UIColor, NSColor and more.
- tiimgreen/github-cheat-sheet · GitHub
- An extremely nice collection of tips for git and GitHub. I already had a bunch of convenience aliases set up, but I was missing git status -sb.
- Alexandru Cobuz wrote on April 10th, 2014 at 05:54:
- PredictionIO is an open source machine learning server for software developers to create predictive features, such as personalization, recommendation and content discovery. I will definitely check it out at the next Google Friday.
- A fast MySQL driver written in pure C/C++ for Python. Compatible with gevent through monkey patching.
- Bypassing “clang: error: unknown argument”
- This page describes how to bypass that nasty Clang 5.1 problem when compiling stuff in Mavericks. Note that, for fish users, the fix is using
set -x ARCHFLAGS -Wno-error=unused-command-line-argument-hard-error-in-future
pip install whatever
- A curated list of free programming books hosted at GitHub.
- Data scientists need their own GitHub. Here are four of the best options
- Devs have GitHub. Now data scientists have more tools to help them work together.
- Quick tip: Best practices for rechargeable batteries - The Sweet Setup
- Some of the best (good?) practices for rechargeable batteries.
My first machine was some IBM 286, which a neighbour gave to my father when I was a kid. At work they were upgrading to, probably, 386’s so he could take it home. Since he already had one, we were the lucky ones. I think I was six, so this was probably 1990.
The machine ran DOS and some years passed by before we got something that could run a GUI. But when the GUI finally came, it was a revolution for us. We began with Windows 95. As the years passed, Windows would get better and better; however, of those years, I vividly remember one thing: all the time lost trying to troubleshoot and fix that crap. When, probably in 2003, I got an iPod I was surprised that this small, highly technological music player would just… work. A friend at my university had a Power Mac at home and he told me that Macs were all like that: they just worked. So I got a part-time job and by next spring (the new Intel MacBook Pros were already out), I bought an iBook G4. It was a fantastic machine and I loved every inch of it (12, in diagonal). From the battery life to the trackpad to the integration between hardware and software. That day I said to myself that I would never go back to Microsoft products and PCs in my life. At the Physics institute we were using LaTeX and all kinds of scientific software anyway, so Office was never an issue.
In the years that ensued, I never regretted that decision. Fed up with “Thanksgiving customer support”, or whatever it’s called here in Europe, I had all my relatives switch to Macs: my parents, my brother, my uncle, my in-laws (parents and brothers) and all the friends that ever asked me to help them with their Microsoft products. I lost count over the years, but by the time I got married I had convinced some 25 people to switch.
Fast forward 10 years. I found myself using:[1]
- Google Apps for this domain;
- Google Apps at work, with Google Drive to sync work’s files;
- Dropbox to sync my personal files, used as a back-end of a handful of (iOS) apps;
- iWork for my Office needs (yeah, I’m not in academia anymore).
Something in this setup began to crack though. Google “coolness” began to fade. First they retired the mobile Exchange support for Gmail and then they sunset their Google Apps free tier.[2] I was grandfathered into their new plan, but I felt like an unwanted guest.[3] At the same time, or around that time, Google initiated a Google Plus-ification of their product line to push everybody onto their Google Plus wagon. They also initiated a Gmail redesign that resulted in worse UI and UX.[4] The list doesn’t end here. Google sunset one of my favorite services, Google Reader, effectively reducing the usefulness of a Google account.
At the same time Microsoft completely revamped their web mail offering, Outlook.com. They offer custom domains (up to 50 users), unlimited space, mobile Exchange and IMAP access, and redesigned their website in what I would consider a good way (although I still don’t use it).
So one weekend I switched my email from Google to Microsoft. It felt strange at the beginning, but I really enjoyed having mobile Exchange back and unlimited space.
Some months passed and Microsoft kept cooking up what were, hear hear, actually really cool products. They announced a new, free-to-use version of OneNote. They touted how awesome it was, its cross-platform availability (I use a Mac, an iPhone and an iPad, so being able to access all my notes from all my devices was kind of a big deal), and the ability to sync, for free, up to 7GB of notes through OneDrive.
I decided to give it a spin because, as a geek, I’m always searching for the ultimate productivity tool and even in academia OneNote was known to be a great piece of software. What I found is a well-behaved and native Mac application, self-contained (in fact downloadable from the App Store) and overall nice to use. The iPad version is equally nice and the syncing and collaboration[5] capabilities are also impressive.
Next in line, can you guess? One of the hottest startups in the Valley, the one that welcomed Dr. Rice on the board? Up to a few months ago, it would have been impossible for me to abandon Dropbox, because so many apps I use rely on it as their sync/storage back end. This included 1Password, ScannerPro and Nebulous Notes, notably. But, recently, my 1Password apps freed themselves from Dropbox because I wanted to sync my Agile Keychain with my wife’s MacBook (through iCloud now). As for Nebulous Notes: I used it only to jot down quick notes, so OneNote was the ideal replacement for that.
That left ScannerPro: for those of you not familiar with it, ScannerPro is an amazing app that can upload scanned documents to Dropbox, Evernote, Google Drive and WebDAV. I could have connected it to my company’s Google Drive but if you ever used Google Drive on the Mac, you know how much it sucks. I don’t know if it is specific to my setup, but once a day I would get the dreaded “Google Drive needs to quit” window that forced me to, well, quit it. As a result, I only use Google Drive through the web interface, so the GDrive route for ScannerPro was not completely satisfactory.
But some days ago the fine folks at doo published Scanbot, an app similar to ScannerPro with the notable difference that it syncs to, among others, OneDrive! Here’s my Dropbox replacement! And by using OneDrive for my files, every Office file in my OneDrive folder can be opened, for free, by the Office web apps!
And with that Dropbox went to the Trash and another Microsoft app found its home in my Applications folder.
After all of this happened, I asked myself why I gave “up” so easily on my existing setup. I don’t know the answer, but a couple of ideas come to mind:
- the first reason can probably be ascribed to Justin Williams, of Second Gear fame; I don’t know exactly how it went, but probably after his acquisition of Glassboard, whose back end runs on Azure, he started tweeting and blogging about Microsoft. And he painted quite a different picture from the Microsoft I knew, in a positive way. He let me take a peek from a new angle;
- the second reason is that Microsoft has radically changed from the Microsoft I knew. At their latest developer conference, Build, they announced things which were unimaginable 10 years ago: not only did they open source a bunch of .NET components and related technologies, but they also showed iPhones and iPads on stage, hosted The Talk Show by John Gruber[6] and dropped the notion that everything and everybody should run Windows for Microsoft to be happy.
That said: not everything is perfect in Microsoft land. Here’s a short list of what I don’t like:
- Outlook.com filters[7] and keyboard shortcuts suck!
- Adding email aliases to Outlook.com has to be done via an obscure command line app that only works on Windows (I had to download a Windows VM to make it work) and you’re limited to 5. Here Gmail is years ahead;
- OneDrive does not sync back files very easily; when I modify a file via an Office web app, it takes a while to get it back to my Mac;
- OneNote for Mac still lags severely behind its Windows counterpart and the iPhone app is behind the iPad app.
All things considered, I cannot deny that the new Microsoft is a welcome change for me; they have incredibly talented people, and benefiting from their talent without having to use Windows PCs (or tablets) is a huge win for Apple users.
- This is not an exhaustive list of course. [return]
- This is not completely accurate: there still is some web page through which you can get a plan that supports only one account. [return]
- You may argue that paying could have solved it, but that’s another story. I certainly would if I made money off this site. [return]
- Although I don’t use the web interface for my mail. There’s an app for that. [return]
- I had a colleague download it and we immediately created a Knowledge Base of tips & tricks and tutorials for some of the software/technologies that we use. [return]
- If you don’t know who John Gruber is, let me tell you: it’s a big deal! [return]
- Only one condition per filter? Really? [return]
- Prune all those pesky branches that you have already merged, both locally and from remotes. On GitHub.
- Datalicious Notebookmania – My favorite 7 IPython Notebooks
- One of the most remarkable features of this year’s Strataconf was the almost universal use of IPython notebooks in presentations and tutorials. This framework not only allows the speakers to demons…
- Text File formats – ASCII Delimited Text – Not CSV or TAB delimited text
- ASCII delimited text solves the problems exporting and importing structured text files and is part of the design of the character set. Unfortunately a lot of people and systems use CSV and other printable delimiters such as tab that are broken by design.
- It’s showtime in a terminal near you! Put on your best colours, resize to 80 columns, and let your fingers fly!
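The ASCII delimited text bookmark above is easier to appreciate with a small example. What follows is a minimal sketch, assuming the usual reading of the ASCII control characters: the unit separator (0x1F) between fields and the record separator (0x1E) between records. The helper names dumps and loads are hypothetical, and the sketch ignores edge cases such as a record made of a single empty field.

```python
UNIT_SEP = "\x1f"    # ASCII 31: separates fields inside one record
RECORD_SEP = "\x1e"  # ASCII 30: separates records from each other


def dumps(rows):
    """Serialize a list of rows (each a list of strings) to ASCII delimited text."""
    for fields in rows:
        for field in fields:
            # The scheme only works if the data never contains the separators.
            if UNIT_SEP in field or RECORD_SEP in field:
                raise ValueError("field contains a separator character")
    return RECORD_SEP.join(UNIT_SEP.join(fields) for fields in rows)


def loads(text):
    """Parse ASCII delimited text back into a list of rows."""
    if not text:
        return []
    return [record.split(UNIT_SEP) for record in text.split(RECORD_SEP)]


if __name__ == "__main__":
    rows = [
        ["name", "note"],
        ["Ada", 'commas, "quotes", tabs\tand\nnewlines need no escaping'],
    ]
    # Round trip: no quoting rules are needed for printable delimiters.
    print(loads(dumps(rows)) == rows)
```

Because commas, quotes, tabs and newlines are ordinary data here, none of the quoting rules that make CSV fragile are needed; the price is that the separators are invisible in most editors.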
A short list of links I’ve bookmarked this week
- This is why OneNote is awesome: Maybe a bit over the top, but there are truly a lot of reasons to use OneNote as your note taking app;
- kandan: An Open Source Alternative to HipChat;
- CSS Diner: CSS Diner is a little game to help you learn CSS selectors. Type in the correct selector to complete each level. So cool!
- Practical partitioning: A nice introduction to (MySQL) partitioning (in PDF);
- Using GNU Stow to manage your dotfiles: How to manage the various configuration files in your GNU/Linux home directory (aka “dotfiles” like .bashrc) using GNU Stow. I’ve immediately started using this. I might, one day, blog about it.