What have you learned at ~~school~~ work today?

The theme of continuous learning is one that many people talk about – so much that it has become a bit of a cliché. Not because people mention it all the time, but because often, people are just paying lip service to the notion of continuous learning — they know it’s a good thing – an aspirational goal, if you will – but they don’t follow through, because it’s one of those things that are ideal for procrastination: neither the need nor the benefit is immediate or pressing:

  • The need is not pressing, because you are learning something new for the sake of growing your skills and expertise, not because it is required to complete some task – so it’s not blocking you from achieving any short-term goals.
  • The benefit is not immediate for the same reason – you’re learning something that is (most likely) not directly tied to a specific work item you’re currently working on. So while you may be able to use it soon, you probably won’t get immediate gratification. Instead, you’ll reap the benefit at some undetermined time in the future.

When faced with such a decision – putting some time into learning something new that may benefit us in the future, or making progress on current work – it’s kind of obvious what usually wins. It’s just much easier to make the choice that gives us the short-term benefit at what feels like little or no cost.

My goal today is to remind you and me both that the easy choice in this case is also the wrong one, and to convince ourselves – and I mean ***really*** convince – that this is a crucial decision, and that failure to do the right thing here has a major impact on our careers.

I’m learning on-the-job

Are you? Really?
My concern about learning on-the-job is that it revolves around the idea that our day-to-day job will present us with wonderful opportunities to learn fundamentally new things on a regular basis. In my experience, this does happen, but only in specific circumstances that are relatively rare:

  • Your first years – when you start out as a programmer – regardless of whether you’re a CS major, a self-taught programmer, or a boot-camp graduate – you usually have so much ground to cover that many non-trivial tasks you work on will present you with real learning opportunities. Essentially, since you’re still building your basic skills, just practicing the craft is a major learning activity.
    • But you only get to go through that once.
    • Even during this time, how much you actively pursue learning activities can greatly affect how much you get out of it.
      • If you put an emphasis on learning, you’re more likely to take on tasks that will teach you new things, more likely to do them in a way that will teach you new things, and more likely to seek out all the ways you can learn from them (feedback from others, going through related books/articles/courses, etc.).
  • A new workplace – when you start working in a new place, that usually includes learning some new programming language(s)/build and integration tools/problem domains/system architecture/… . Coupled with the fact that the first 90 days are usually more lax in terms of what is expected of you, this usually provides a good opportunity to learn some big new things.
    • For most of us, this only happens once every 3 or more years, so that leaves a lot to be desired.
  • Major new projects or shifts in tooling or product architecture in your current workplace
    • Similar to some of the opportunities you get when moving to a new place, but in a smaller scope.
    • These kinds of changes are (hopefully) not a common occurrence in your workplace, since too many major changes on a regular basis would make it very hard to be efficient in your day-to-day work.

Don’t get me wrong – it’s not that I think you’re not learning on a daily basis from doing your work – writing code, fixing bugs, reviewing others’ code, getting comments in code reviews, designing solutions for your product etc. are all activities that help you hone your skills. It’s just that they tend to be mostly tactical – meaning, they help you sharpen specific skills or knowledge you already possess, rather than teach you something completely new. As a result, these can contribute to minor improvements in the efficiency and quality of your work, but will most likely not cause you to do things in a substantially different way.

Even when some project requires you to learn a new technology or tool, since the top priority is delivering some work item, you will usually learn just enough to do your job reasonably well, rather than diving in head-first and becoming an expert.

But hey, I may be wrong… maybe your experience is different, and you learn a lot on the job on a regular basis. It’s easy to figure that out – just ask yourself a few simple questions:

  • What did you learn on-the-job in the last month? In the last 6 months? The last year?
  • For each of those, how many do you feel you know well enough to consider yourself an expert on (or you think you will become an expert on in the near future)?

But I read blogs/HackerNews/Twitter/…

But not enough!

Reading blogs/HN/StackOverflow/etc., following links from tech figures on Twitter, reading your LinkedIn feed – these are all great and are part of what you should be doing on a regular basis. Yet they all tend to be rather “shallow”, since there’s only so much you can go into with just a few pages of text (much less 140 chars). So while these are great tools for getting to know what you don’t know, you are not very likely to become an expert on X just by reading a lot of SO questions or long HN threads about it. Following “tech news” and “tech social media” can play a very big supporting role, but the main learning requires focus and deliberate investment of time and energy towards learning something.

I have a personal life, you know

I’m very glad to hear that. You should! And you shouldn’t assume that learning should be done in your spare time (though you can learn in your spare time as well if you want to).

Your employer should be able to support you investing a couple of hours a week in things that don’t create immediate value. They will reap a lot of value in the long term – after all, it’s not like you’re putting in the time to learn how to play the bagpipes – this is work-related. If 3 hours a week seems problematic, go for bi-weekly. Anything less frequent than that, and it becomes ineffective.

If you’re reading this and you’re a manager, and you don’t allow your team to spend any time during their work day on learning, you’re doing it wrong. You’re losing out on the opportunity to have a team that is much more satisfied with its work and more capable (because it’s continuously learning). You should not only allow it, you should actively encourage it: ask them what they learned in the past 3-4 weeks, and make sure they have good answers; arrange day-time hackathons; make sure they have access to online courses (Lynda/PluralSight/etc.) and recommend some specific ones if needed. And if you think you can’t spare 3 hours a week per person, I urge you to read Are you too busy to improve? by @hakanforss.

On-the-job again

With a strong foundation of learning and growing, “on-the-job learning” can take on a much more important role – this is where you can find opportunities to put that newfound knowledge of yours to good use and practice it. Because no matter how much you read about something, go through tutorials, exercises, and pet projects, there’s nothing like applying what you learn to real world problems to help solidify it with that important extra dimension of applied (vs. theoretical) knowledge. Just remember there’s a fine balance here – you don’t want to make every task into an experimental project.


Stop reading this post and go learn something instead.

Last piece of advice

It makes it much easier to stick to your learning schedule if you have clear goals for what you want to learn and why. I also find it helpful to have checkpoints that will highlight any failure on my part to keep up with my intentions. Both of these are easily addressed:

  1. Figure out an ordered list of things you want to learn, pick the first one, and commit to learning the basics in the next 3 weeks.
  2. Set a recurring reminder in your calendar for every 4 weeks, to take 10 minutes to jot down the new things you learned in the past 1-2 months. If you don’t have good answers to that question, you’ve probably been skipping class.

Failing in Rust: The fun of a lifetime

And today – solving the ETL problem on Exercism.

After figuring out that the problem description is somewhat misleading[1], it seemed like this should be pretty straightforward:

  • Iterate over the input and for each node, create a set of items – one for each String in the vector – mapping string → score
    • Since we’re emitting multiple items per each node in the original input, we need to use flat_map
    • Internally, when iterating over the items in the vector, we can use the standard map function
  • Collect the emitted items into a BTreeMap

Ah, yes. Fun with iterators and references – one should always keep in mind which iterator they are using, and what that implies for the iterator’s methods, and specifically their input types. Fortunately, that’s very aptly and succinctly summarized in the std::iter documentation under “The three forms of iteration” – if you’re not familiar with the three forms yet, I suggest you go read it right now. So here’s my first failed attempt:
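The code block did not survive the archiving; below is a best-guess reconstruction of a first attempt along the lines described above. The concrete types (a BTreeMap<i32, Vec<String>> in, a BTreeMap<String, i32> out) are my assumption based on the problem description, and as discussed next, this does not compile:

```rust
use std::collections::BTreeMap;

// Hypothetical reconstruction -- NOT the author's exact code; does not compile:
pub fn transform(input: &BTreeMap<i32, Vec<String>>) -> BTreeMap<String, i32> {
    input
        .iter() // yields (&i32, &Vec<String>) pairs
        .flat_map(|(score, strings)| {
            strings.iter().map(|s| (s, score)) // emits (&String, &i32) tuples...
        })
        .collect() // ...but collect() is asked to build (String, i32) entries
}
```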

And the corresponding errors:

What is happening here is that I’m using ‘iter()’, which iterates over references, yet I’m using the arguments as if they were non-references. Specifically, both ‘string’ and ‘score’ are references, so when I’m returning ‘(string, score)’ from the closure passed to ‘map’, I’m actually returning a tuple of type (&String, &i32), yet in the ‘collect’ call I’m trying to treat them as tuples of type (String, i32).
The error message from rustc can be a bit confusing, because it says that a reference is expected and a String is found, whereas from my POV, since we specified explicitly that we expect String, I’d expect the “expected” and “found” parts to be the other way around.

Unfortunately, my POV is wrong. If you look closely, the error message clearly marks what the error pertains to (.collect…), so rustc is actually helpful enough to tell me exactly what to look at… that is, assuming I know how to properly read its error messages, which might be a bit of a far-fetched assumption in my case.

OK, we should be able to solve this in no time. I’ll just de-reference my references… oh wait, constructing the tuple from the dereferenced String will transfer ownership of that String to the tuple, which is bad because we don’t want to alter the input (also, we can’t, because it’s not mutable). OK, so I’ll just clone the string (i32 is copied over anyway), no big deal. Here’s my second version of the transform expression:
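Again the snippet itself is missing; reconstructing the second version under the same assumed types, the transform expression presumably looked roughly like this (still not compiling, for the reason discussed next):

```rust
use std::collections::BTreeMap;

// Second hypothetical reconstruction (assumed types; not the author's exact code).
// The tuple types are now correct, but it still does not compile:
pub fn transform(input: &BTreeMap<i32, Vec<String>>) -> BTreeMap<String, i32> {
    input
        .iter()
        .flat_map(|(score, strings)| {
            // The inner closure borrows `score`, but the Map adapter that stores
            // the closure is returned from this closure and outlives the borrow:
            strings.iter().map(|s| (s.clone(), *score))
        })
        .collect()
}
```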

This should work, right? I’m returning the correct types, and just doing a (relatively) simple mapping. What could go wrong? Well, apparently, Rust doesn’t like the fact that the closure I’m passing to ‘map’ encloses the ‘score’ reference, and insists that the reference’s lifetime is shorter than that of the context in which it is used:

After staring at this error for a few minutes, I sort of realized that this is probably related to the fact that iterators in Rust are lazy, which means that at the point the call to flat_map returns, we haven’t actually done (almost) any work. It’s the call to ‘collect’ that actually iterates through the items and runs the transformation. I’m saying “sort of”, because I still couldn’t fully grasp what the actual problem was…
Initially, I thought that the problem was that ‘score’ exists within the scope of the flat_map call, so when we return from that call it goes out of scope and reaches the end of its lifetime, yet we need it when doing the actual iteration. After some more mental parsing, I concluded that this model is incorrect – ‘score’ does not live within the scope of the call to flat_map; instead, it lives within the scope of the closure invoked by flat_map when we iterate over it. But what does that mean exactly?
I tried to reason about this just by looking at my code, and figured out I was getting nowhere fast[2], so I decided that I needed to dig into how flat_map() and map() actually work:

  • I started out with the documentation for std::iter, and the map and flat_map functions.
    • The documentation for std::iter didn’t teach me anything new per se, but it did help solidify and reinforce some points. If you haven’t thoroughly read it yet, I recommend doing so – it’s time well spent (and it’s not that long anyway).
    • The documentation for both map and flat_map was very useful, because it introduced me to how the magic of lazy evaluation works for iterator adapters: both adapters return a struct that implements the Iterator trait (std::iter::Map and std::iter::FlatMap, respectively), so when we loop over the result of a call to map(), we are actually iterating over an std::iter::Map, which means we’re calling the .next() method on that instance.
      This means that Map and FlatMap (and basically any other adapter) store the Iterator on which they were originally called, and call that Iterator’s methods as necessary. Such adapters also store some extra data that allows them to actually perform the adaptation – more often than not, there will at least be a mapping function[3].
  • Equipped with that knowledge, I tried to think of how FlatMap works internally, in order to figure out how the two closures interact and what the lifetime of ‘score’ is. I started coming up with some ideas, but quickly realized that I’m going about it the wrong way – I can just look at the source code! To be completely honest, I knew this approach is probably an overkill, but at this point I couldn’t stop – I needed to know how things were actually implemented.
  • Let’s start with map() and its corresponding struct, ‘Map’. This is pretty straightforward to implement – it keeps the original iterator, as well as the mapping function you pass to map(), and whenever next() is called, it just calls next() on the internal iterator and invokes the mapping function on the result.
  • How does FlatMap work? Every time it invokes the mapping function, it gets back an iterator rather than an item, and FlatMap is expected to return each of the items of this iterator in sequence[4]. This complicates things a bit:
    1. Let the original iterator be ‘It1’
    2. Calling It1.flat_map(Fn1) returns a FlatMap object – ‘FM’ – that holds both ‘It1’ and ‘Fn1’ as members
    3. When we call the FM.next() function:
      1. It basically calls Fn1(It1.next()), which returns an iterator, and stores that iterator as ‘It2’[5]
      2. It then calls It2.next() and returns the result
      3. In subsequent calls to FM.next(), it will first try to retrieve items from It2, and once It2.next() returns None, it will refresh It2 by calling It2 = Fn1(It1.next())
      4. This is a bit simplified – we need to also handle edge cases and bi-directional iteration, but the general behavior is as described above. Read the source code for full details.
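The laziness described above is easy to observe with a small, self-contained sketch (my own illustration, not code from the original post): nothing runs when the adapter is created, and each call to next() pulls items through the chain, refreshing the inner iterator when it is exhausted.

```rust
fn main() {
    let v = vec![vec![1, 2], vec![3]];
    // Creating the FlatMap adapter does no work yet:
    let mut fm = v.iter().flat_map(|inner| inner.iter().map(|x| x * 10));
    // Each next() call pulls from the current inner iterator ('It2'),
    // refreshing it from the outer iterator ('It1') once it is exhausted:
    assert_eq!(fm.next(), Some(10));
    assert_eq!(fm.next(), Some(20));
    assert_eq!(fm.next(), Some(30)); // refreshed from the second inner vec
    assert_eq!(fm.next(), None);
}
```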

So, getting back to my lifetime error, what is actually happening here? Now that we have a better understanding of how the underlying mechanisms work we can get a better idea of what the lifetime of ‘score’ is, as well as the lifetime of the closure passed to map():

  1. The call to input.iter().flat_map(…) returns an object of type FlatMap, which takes ownership of input.iter(), and also of the closure passed to flat_map – call it ‘C1’
  2. When we call collect() on the FlatMap object, it traverses the iterator (by calling next() repeatedly)
  3. When next() is called, FlatMap calls next() on the internal iterator (input.iter()), and then passes the result to C1, which in turn returns a Map object. This is where the problem lies:
    • The Map object takes ownership of strings.iter()
    • It also takes ownership of ‘C2’ – the inner closure passed to map() – and C2 stores a reference to ‘score’, which means the Map object stores a reference to ‘score’.
    • At the point when C1 returns, ‘score’ goes out of scope, yet the Map object is returned and lives on (FlatMap took ownership of it), so the reference stored by Map is basically a dangling reference (put differently, Map has a longer lifetime than one of its members)

Ultimately, this doesn’t really require a deep understanding of Map/FlatMap – just the understanding that iterators are lazy, and the implication that has: adapters return objects, and if those objects are returned from some function, then they cannot reference any local bindings of that function, because that would break the lifetime guarantees.

So what’s the right way to do this?
This works:
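The original code block is gone; here is one version that does compile, under the same assumed types (a sketch, not necessarily the author’s exact fix). Collecting the inner iterator into a Vec inside the closure materializes the pairs while `score` is still alive, so no reference escapes:

```rust
use std::collections::BTreeMap;

// A working sketch (assumed types; not necessarily the author's exact code):
pub fn transform(input: &BTreeMap<i32, Vec<String>>) -> BTreeMap<String, i32> {
    input
        .iter()
        .flat_map(|(score, strings)| {
            // Collecting here runs the inner map eagerly, while `score` is
            // still in scope, so the closure returns owned data:
            strings
                .iter()
                .map(|s| (s.clone(), *score))
                .collect::<Vec<_>>()
        })
        .collect()
}
```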

As does this one, which moves[6] ‘score’ so that it is owned by the map() closure – for more details, see the Move Closures section in the Rust Book:
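A sketch of the `move` variant, again under the same assumed types (`score` is an &i32, which is Copy, so `move` copies it into the inner closure):

```rust
use std::collections::BTreeMap;

// A sketch using a `move` closure (assumed types; not necessarily the author's code):
pub fn transform(input: &BTreeMap<i32, Vec<String>>) -> BTreeMap<String, i32> {
    input
        .iter()
        .flat_map(|(score, strings)| {
            // `move` copies `score` into the inner closure, so the returned
            // Map adapter owns everything it needs:
            strings.iter().map(move |s| (s.clone(), *score))
        })
        .collect()
}
```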

Looking back at this, it seems like it should have been easier to debug than it was – I actually spent almost a full hour digging into this (not including the time it took to write this post). Nothing like a bit of Rust to make me feel like a complete noob… But as always, it was educational, so time well spent.

Footnotes

1. It suggests that the input format is a mapping of score → collection-of-characters, and the output should be a mapping of character → score, but in reality, the test suite is looking at mappings of score → collection-of-strings and transforming them to string → score
2. Well, not that fast, actually
3. Strictly speaking, the iterators and adapter don’t have to store anything – but without storing the original iterator and some extra data, there aren’t many meaningful things it can do. Nevertheless, there are at least some cases where that is handy – e.g. the empty iterator
4. This is the “Flat” in “FlatMap”
5. This is simplified, since it needs to handle the case when mapping returns an empty iterator etc., but the general idea is still the same
6. To be exact, in this case it copies, since score is &i32, and that means it implements the Copy trait, which causes values to be copied rather than moved

Failing in Rust: Creating a vector of functions

Another adventure in Rust compilation is here – this time, trying to solve the “Bob” Exercism problem.
This is a pretty simple problem to solve:
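The snippet is missing from the archive; a naive if-based solution might look like this (my reconstruction, using the standard reply strings from the Exercism “Bob” exercise – the exact rule set is an assumption, not the author’s code):

```rust
// A sketch of a naive solution (my reconstruction, not the author's exact code):
pub fn reply(message: &str) -> &'static str {
    let msg = message.trim();
    // "Yelling" = has at least one letter and no lowercase letters:
    let is_yelling =
        !msg.is_empty() && msg.to_uppercase() == msg && msg.to_lowercase() != msg;
    if msg.is_empty() {
        "Fine. Be that way!"
    } else if is_yelling {
        "Whoa, chill out!"
    } else if msg.ends_with('?') {
        "Sure."
    } else {
        "Whatever."
    }
}
```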

But then I thought – all those ‘if’ clauses are a bit unsightly; I can do away with them if, instead, I create a list of (evaluator, reply) tuples, and then use iterators to find the first matching evaluator and return the corresponding reply. This has the added benefit of using my new favorite feature in Rust – iterators (along with the unfortunate downside of being more complex, but I’m here to learn, so that’s immaterial).
So here’s my first attempt at creating and using a vector:
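The original snippet is gone; a best-guess reconstruction of the kind of attempt described (not the author’s exact code), which does not compile:

```rust
// Hypothetical reconstruction -- does not compile:
let evaluators = vec![
    (|m: &str| m.is_empty(), "Fine. Be that way!"),
    (|m: &str| m.ends_with('?'), "Sure."),
];
// error: mismatched types -- every closure has its own unique concrete type,
// so vec! infers the element type from the first closure and rejects the rest
```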

Which results in a long stream of errors… let’s look at the first one:

From the highlighted lines above, it appears that vec! infers the type of the “Evaluator” part of the tuple to be the concrete type of the specific closure defined at line 3 of my program. What I actually want is a generic function that takes in a string slice and returns a bool. So I changed the binding declaration to specify the concrete type of the vector:

Which retaliated with yet another barrage of errors, the first of which was:

Right – a trait is, by definition, not Sized, since it doesn’t represent a concrete object, and different objects that implement the trait may have different sizes. And why does this have to be Sized? Because std::vec::Vec requires its elements to be Sized – otherwise it would no longer support random access[1].
So, what do we do? Thanks to New Rustacean Ep. 05, I already know the answer to that – just Box it. So here’s my next attempt:
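Reconstructing the Boxed attempt (a fragment in current Rust syntax – `dyn` postdates the original post, and this is not the author’s exact code). Boxing fixes the Sized problem, but as the next error shows, one bug remains:

```rust
// Hypothetical reconstruction -- Boxed, but still fails to compile:
let evaluators: Vec<(Box<dyn Fn(&str) -> bool>, &str)> = vec![
    (Box::new(|m: &str| m.is_empty()), "Fine. Be that way!"),
    (Box::new(|m: &str| m.ends_with('?')), "Sure."),
];
let answer = evaluators
    .iter()
    .find(|(evaluator, _)| evaluator(message))
    .map_or(|(_, reply)| *reply, "Whatever."); // bug: arguments swapped --
                                               // map_or takes the default first
```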

Things are starting to look better (compiler output fits in the console window without scrolling – Yay!):

Huh? Oh right, I’m stupid. A second look at the documentation of ‘map_or’ shows that it takes the default value first and the mapping function second, not the other way around. That seems counter-intuitive to me, but I might be the odd one out (or maybe there’s something I’m missing). This works:
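A sketch of the working Boxed version (my reconstruction in current Rust syntax, with a simplified rule set – not the author’s exact code):

```rust
// A working sketch (my reconstruction; rules simplified):
pub fn reply(message: &str) -> &'static str {
    let rules: Vec<(Box<dyn Fn(&str) -> bool>, &'static str)> = vec![
        (
            Box::new(|m: &str| {
                !m.is_empty() && m.to_uppercase() == m && m.to_lowercase() != m
            }),
            "Whoa, chill out!",
        ),
        (Box::new(|m: &str| m.ends_with('?')), "Sure."),
        (Box::new(|m: &str| m.is_empty()), "Fine. Be that way!"),
    ];
    rules
        .iter()
        .find(|(evaluator, _)| evaluator(message))
        // note the argument order: default value first, mapping function second
        .map_or("Whatever.", |(_, reply)| *reply)
}
```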

Is it prettier than my original solution? No.
Is it shorter? No.
Is it faster? I doubt it very much[2].

But it was educational, so as Bob would probably put it: “WORTH IT!”.

I just figured that Boxing may be a bit of overkill here. Since the closures’ lifetime is the function body, we can just use plain references instead and avoid the heap allocation. This yields slightly more verbose syntax:
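A sketch of the reference-based version (my reconstruction, current syntax, not the author’s exact code). Note that the closures must be bound to variables first, so that the references stored in the vector point at bindings that outlive the vector:

```rust
// A working sketch with plain references instead of Boxes (my reconstruction):
pub fn reply(message: &str) -> &'static str {
    // Bind the closures first; &closure coerces to &dyn Fn below:
    let is_yelling =
        |m: &str| !m.is_empty() && m.to_uppercase() == m && m.to_lowercase() != m;
    let is_question = |m: &str| m.ends_with('?');
    let is_silence = |m: &str| m.is_empty();
    let rules: Vec<(&dyn Fn(&str) -> bool, &'static str)> = vec![
        (&is_yelling, "Whoa, chill out!"),
        (&is_question, "Sure."),
        (&is_silence, "Fine. Be that way!"),
    ];
    rules
        .iter()
        .find(|(evaluator, _)| evaluator(message))
        .map_or("Whatever.", |(_, reply)| *reply)
}
```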

It would have yielded nicer code if we could define the closures directly in the vector initialization, but then we’d be taking references to items whose lifetime is only that of the vector-initialization statement, which means we’d have references that outlive their targets.

Footnotes

1. That’s probably more involved than just this, since Rust has quite a few restrictions about storing un-Sized data, but even that alone is a good enough reason why it can’t be supported in Vec, regardless of the other restrictions.
2. It introduces heap allocations and dereferences, not to mention that it can’t be optimized by the compiler as well as the hard-coded decision tree in the naïve solution.

Failing in Rust: Iterating over HashMap.keys()

This is the first of a series of posts I’m planning to write to share my experiences with learning Rust, and the failures that go along with it. Hopefully, my failures can help others in learning the language and maybe overcoming problems. Stay tuned…

As part of picking up Rust, I’m going through the Exercism exercises, and just the other day, I was coding the solution to the raindrops exercise and hit the compilation failure described below, which baffled me at first. I dug into it, figured out the issue, and thought I’d share the details of the failure, what I found, and how.

For context, my approach to the solution was to create a HashMap, where the keys were the different factors, and the values were the strings associated with those factors. I would then iterate over the keys, filter out any that are not factors of the input, and then concatenate the strings from the remaining ones[1].

Here’s my first attempt at this:
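The snippet was lost in archiving; here is a best-guess reconstruction of the kind of first attempt described (not the author’s exact code), which does not compile:

```rust
use std::collections::HashMap;

// Hypothetical reconstruction -- does not compile:
fn raindrops(n: u32) -> String {
    let mut sounds = HashMap::new();
    sounds.insert(3, "Pling");
    sounds.insert(5, "Plang");
    sounds.insert(7, "Plong");
    let result: String = sounds
        .keys()
        .filter(|key| n % key == 0) // error: `u32 % &&_` -- Rem is not implemented
        .map(|key| sounds[key])
        .collect();
    if result.is_empty() { n.to_string() } else { result }
}
```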

Simple enough, right? Not according to the Rust compiler:

How helpful, right? If you have some experience with Rust, there’s a good chance you’ll be able to quickly spot the issue from this message, but for me it was pretty opaque. It was easy to figure out from the docs that core::ops::Rem is the trait implementing the behavior of the ‘%’ operator (“Rem” stands for “Remainder”), but I couldn’t figure out why the compiler was complaining that it’s not implemented for u32 % u32…

So I took a step back and tried this:
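The snippet itself is missing from the archive; the experiment was presumably something like this minimal check (my reconstruction):

```rust
fn main() {
    let x: u32 = 3;
    let y: u32 = 2;
    // Plain u32 % u32 works fine:
    println!("{}", x % y); // prints 1
}
```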

Builds, runs, and prints 1 as expected.
But then I figured that the iterator returns references to the elements rather than the elements themselves… but that still seemed odd – you’d expect the operator to work on items and references alike, right? Fortunately, it’s easy to check:
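Again a reconstruction of the presumed experiment, this time with references:

```rust
fn main() {
    let x: u32 = 3;
    let y: u32 = 2;
    let rx: &u32 = &x;
    let ry: &u32 = &y;
    // Rem is also implemented for references to integers:
    println!("{}", rx % ry); // prints 1
}
```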

Still builds, runs, and prints 1. Duh.

Is it possible that my HashMap is holding something other than an integer type for the key, and that’s what’s causing the issue? If the key type is non-integer, that would explain why we can’t use it in an arithmetic expression. So I decided to explicitly bind the generic types of my HashMap and see if that solves the issue:
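A reconstructed fragment with the generic types bound explicitly (my guess at the shape of the code, not the original) – it still fails, but the error now names the offending type:

```rust
// Hypothetical fragment -- still does not compile:
let sounds: HashMap<u32, &str> = HashMap::new();
// ... (insert the factor/sound pairs) ...
let result: String = sounds
    .keys()
    .filter(|key| n % key == 0) // error: `u32 % &&u32`
    .map(|key| sounds[key])
    .collect();
```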

Still fails, but now the error is a bit more verbose:

Aha! The type of ‘key’ is &&u32, rather than &u32. Looking back at the original error message, the data was actually there too, but I missed it – it complained that the trait is not implemented for the type ‘&&_’ rather than ‘&&u32’ as in this last case, because the generic type wasn’t explicitly bound. I changed my code to dereference ‘key’, and it started compiling:
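A compiling sketch along these lines (my reconstruction, not the author’s exact code; note the footnote’s caveat that HashMap iteration order is arbitrary, so multi-factor outputs come out in no particular order):

```rust
use std::collections::HashMap;

// A working sketch (my reconstruction):
fn raindrops(n: u32) -> String {
    let sounds: HashMap<u32, &str> =
        [(3, "Pling"), (5, "Plang"), (7, "Plong")].iter().cloned().collect();
    let result: String = sounds
        .keys()
        .filter(|key| n % **key == 0) // keys() yields &u32; filter adds another &
        .map(|key| sounds[key])
        .collect();
    if result.is_empty() {
        n.to_string()
    } else {
        result
    }
}
```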

But where does this double-reference come from? Looking at the documentation of the std::collections::HashMap ‘keys’ function, it clearly says that it returns &K, not &&K:

The answer lies in the std::iter::Iterator ‘filter’ function – the predicate passed to ‘filter’ is of type P: FnMut(&Self::Item) -> bool.

Meaning, it takes a reference to the type of item returned by the iterator. Since the iterator is already returning references to the keys, the predicate ends up getting a double reference. This, apparently, is a well-known gotcha, and the ‘filter’ documentation warns about it – but I either never read that part, or just didn’t pay enough attention when I did:

Lessons learned:

  1. Pay close attention to the type details in compilation errors, and don’t get confused by catch-all type markings (_). More often than not, all the information you need is there in the error message – you just need to learn how to read it.
  2. Read the documentation. Closely.

Footnotes

1. As an aside: this solution has a bug – namely, there’s no guarantee about the iteration order of the keys. I changed my approach to use an array of tuples instead – you can see my final solution here.

Into JSLand

For my recent Hackathon project, I needed to learn JavaScript (a.k.a. ECMAScript), because I wanted to use d3.js for visualization.

D3.js-powered line graph

I figured it shouldn’t be too hard[1], and set out to look for tutorials and resources… There are A LOT of those around, from generic ones to very specific ones, and also a vast amount of knowledge and answers on StackOverflow for addressing all those annoying little things you come up against once you start actually writing code – you know, all those things that would have been extremely easy if you had even rudimentary experience with the language, but that you nevertheless end up banging your head against for 10-15 minutes just to get a string formatted a little differently.

While having so much information available is great, it can be overwhelming. Below is my summary of the resources I found most helpful in picking up JS, Node.js, and d3.js – obviously, this is very subjective, so your mileage may vary:

  • JavaScript – My goal was to get a basic understanding of JavaScript, so that I can play around with d3.js to render graphs for time-series data. This is not a very high bar – it requires basic understanding of the JS syntax, and the ability to send a request to a web-server to retrieve the data for rendering.
    • Need to quickly learn all the basics just to read some snippets or maybe throw together some code? I found MDN’s “a re-introduction to JavaScript” very useful as a relatively short read that nevertheless introduces you to all the major features and syntax points. Of course, it won’t make you a JS expert and isn’t enough on its own, but it’s a great primer that can get you up and running quickly, assuming you have a few years of programming experience under your belt.
    • For more depth, staying right there on MDN, I read through the JavaScript Guide. It goes deeper than the previous resource, but is still short enough that you can go through it in a few hours. The main added value for me was its deeper coverage of prototype-based objects in the “Details of the object model” section, but the deeper coverage of syntax and behaviors was useful as well.
  • Node.js – I needed to be able to invoke .NET code that would do the actual data retrieval (pre-existing module), do some text manipulation of that data, and listen to HTTP requests from the client-side JS to serve it the data for rendering. This is all quite basic as web servers go – no real routing or complex logic, just a simple endpoint that either serves a static HTML file (if someone invokes GET for the root with no query string), or parses the query string, invokes the .NET code with the provided parameters, processes the response and sends back the results (in case there was a query string in the URI)
  • d3.js
    • I found the API reference to be very good in introducing concepts like selectors
    • d3.js has a lot of examples. Not only are there many types of visualizations available, for many of those (and definitely for the more common ones, like line graphs) there are many different examples at different levels of complexity. While it’s great to have so many examples, it can be a bit overwhelming. Of the different examples and tutorials we’ve seen for line graphs with d3.js, we eventually settled on these:
      • Simple d3.js line graph – we started off with this one, because it was simple and didn’t involve too much code. We read the basics of d3.js, then read the code and used it to drive our learning of the concepts (meaning, we read the code, then went back to the API reference to gain a better understanding of the specific capabilities that were used). I found it to strike a good balance between being very simple on the one hand, and not being completely contrived and boring (Hello World, anyone?) on the other.
      • The only problem with the simple d3.js line graph example… is that it yields a pretty miserable graph. We wanted a nicer graph – multi-line, with coloring, toggling of specific lines, and the ability to “scan” the graph with the mouse and get the values at the ‘x’ coordinate where the mouse is pointing. We found this example of a “Multi-line graph with automatic legend and toggling”[2] that provides a detailed walk-through of how to create a graph with all of the features mentioned above except for “scanning”.
      • For inspiration on how to implement “scanning”, we used yet another example – “Interactive Line Graph”. This last one is an impressive example of the cool stuff you can do with d3.js[3] – it implements multi-line graphs, scanning, different scaling, continuous updates of the graph data, and more features[4], which means it also spans quite a bit of code. It goes far beyond what we needed, so we didn’t delve into it much, and instead just got leads on how to figure out the ‘x’ value from the mouse position over the graph.

Eventually, my two teammates and I managed to deliver a working application that shows a basic web page for entering parameters (with zero design – raw HTML buttons and text boxes), then takes these inputs and sends them to a Node.js backend that, in turn, uses Edge.js to invoke .NET code that queries our back-end systems to get performance counter values. The server then sends the response back to the client, which renders it with our d3.js code, to yield graphs like the one you see at the beginning of the post. Great success!

Footnotes

1. I was right
2. In other words, I could have just skipped the other two and use only this one
3. and JavaScript in general
4. e.g. interactive updates of the graph data

And now, for something completely different

I’ve been thoroughly enjoying myself with taking the first steps in Rust, and am trying to put some more time in to play around with it, before I try my hand at any “real” project. Ideally, I would focus any learning time solely on this… But it’s not an ideal world. My team has a Hackathon this week [1], and while I was hoping to do something in Rust, I am not yet proficient enough to do anything of real interest (probably in the next Hackathon).

Instead, I decided to work with a couple of people to create a dashboard for visualizing counter data from our services when we investigate issues related to our live-site performance. We do have an existing dashboard app that can create and manipulate graphs from this data, but while it has its merits, there’s a lot of room for improvement. Creating an alternative seems like a good opportunity to build something that may be useful in the long run, as well as a good opportunity to learn something new – namely, d3.js.

So I have temporarily diverted my mental resources from Rust – which I’d categorize as *very* strongly typed – to JavaScript – which I would categorize as the exact opposite. It’s an interesting journey.

Coming soon – a post about my adventures in JS-Land, and one covering the cool things I’ve seen in Rust so far… that is, if my head doesn’t explode first from trying to learn all these things at once.

Footnotes

1. I can’t complain about it too much, since I’m the one who organized it

Getting Rusty

So, I’ve decided to learn Rust…
I’ve been hearing a lot lately (in several different podcasts) about Go, Rust, and D, all in the context of system languages, and decided that I need to look further into these. Are they really viable replacements for C++? Or can I at least learn some interesting approaches? Even if not, just learning something new is almost always interesting and a good mental exercise.
Based on the anecdotes I heard on the podcasts, and some very basic skimming of the high level details, I realized that:

  1. Go is garbage-collected… which means it’s very likely to have some performance limitations that would not make it a generally viable replacement for C++ [1].
  2. D is… sort of garbage collected. It supports writing code without garbage collection, but its standard library requires garbage collection, and I did not want to pick up the endeavor of writing in a language without a standard library and brewing my own version of basic services at this point in time. The main upside I saw for ‘D’ was that it’s backed by Andrei Alexandrescu, who wrote one of the best C++ books out there.
  3. Rust seemed the most natural fit for what I was looking for – inspired (at least to some extent) by C++, but introducing new/different idioms and focusing on safety (via compile time validations), execution speed, and parallelization. Also, between D and Rust, the latter seems to have gained more adoption.

So, I have now started reading The Rust Book and am going through some Exercism exercises… We’ll see how it goes from there.

Footnotes

1. This is just my assumption based on previous experience with GC languages, and the general approach to GC in the C++ community. I realize this may be a wrong assumption, but I didn’t want to invest time to figure that out at the moment

Hashtable Naïveté

or How std::unordered_map can kill your code’s performance (if you’re not careful)

If you’re in a hurry, and just trying to find a summary of a performance issue with std::unordered_map, you can jump directly to the Summary section.

A few months ago, I found a performance issue in some C++ code I’ve written. The general premise of the code was:

  1. Get a list of n items
  2. Initialize a hash-table, and reserve(n)
  3. Iterate over a list of items and, for each one:
    • Extract a key based on the data in the item
    • Search that key in the hash-table
      • If the key didn’t exist, add it to the hash-table with some info (an enum value)
      • If the key did exist, use the value to determine how to process the current item in the iteration, and update the hash-table value if needed.
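
The steps above can be sketched as follows – a minimal illustration rather than the original code, with made-up items (plain ints) and a made-up key-extraction rule:

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

enum class Info { First, Repeated };

// Sketch of the algorithm described above. The "key extraction" here is an
// arbitrary O(1) operation (item / 10), purely for illustration.
std::unordered_map<int, Info> process(const std::vector<int>& items) {
    std::unordered_map<int, Info> table;
    table.reserve(items.size());              // step 2: reserve(n) up front
    for (int item : items) {                  // step 3: iterate over the items
        int key = item / 10;                  // extract a key from the item
        auto it = table.find(key);            // search the key in the table
        if (it == table.end()) {
            table.emplace(key, Info::First);  // key missing: add it with info
        } else {
            it->second = Info::Repeated;      // key present: update the value
        }
    }
    return table;
}
```

With O(1) key extraction and O(1) amortized hash-table operations, the loop should cost O(n) overall – which is exactly the expectation that the rest of this post dismantles.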

Since extracting the key, as well as the processing of the item, were simple O(1) operations, I expected it to run in O(n) run-time complexity, given that hash-tables usually perform in O(1) amortized run-time complexity… but instead, in reality, the code was exhibiting O(n²) run-time performance.

An (extremely brief) explanation of how common hash-tables work

  • The hash-table maintains an array of “buckets”, each holding a linked-list of key-value items.
  • When inserting a key, it is mapped to a bucket, and then the key-value item is added to the linked list for that bucket.
  • When looking up a key, it is mapped to a bucket, and then the list for that bucket is scanned to see if it contains the key we are looking for.
  • Ideally, mapping of the key to a bucket is done by the hash function. In reality, this is usually a two-step operation – the first step is using the hash function to map the key to a hash-value in a large space of integers, then the second step is to map the resulting hash-value to a bucket.
    • One reason for this two-step process is that the hash function can be provided by the user of the hash-table – as is the case for C++’s std::unordered_map, to allow using any type as the hash-table key. Instead of providing a hash function that takes the number of buckets as an input and returns a bucket ID – which would be considerably more complex – the provided hash function just needs to map keys to a size_t hash-value (32/64 bits, depending on the target architecture), and mapping to a bucket is taken care of by the hash-table implementation in a common way.
    • One simple and straightforward way to do the hash-value –> bucket mapping is just to use a modulo operation:
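
The snippet that originally illustrated this mapping isn’t reproduced here, but it boils down to something like the following sketch (the function name is mine):

```cpp
#include <cassert>
#include <cstddef>

// Map a hash value to a bucket by taking the remainder modulo the number of
// buckets; every hash value lands in the range [0, bucket_count).
std::size_t bucket_for(std::size_t hash_value, std::size_t bucket_count) {
    return hash_value % bucket_count;
}
```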

      As we’ll see later, this obvious approach can be problematic

Taking a step back – Hash-tables provide O(1) access, right?

Wrong! This is one of my pet peeves when conducting interviews. When asked (and sometimes, even when they aren’t asked), people often say that hash-tables provide O(1) access. This is not merely inaccurate – it’s just plain not true. When we talk about the run-time complexity of a hash-table being O(1), this:

  1. Refers to the amortized analysis of the hash-table run-time – in other words, it’s an analysis of the performance *in the average* case [1]. An individual operation is not guaranteed to be O(1), and unless you size your hash-table to fit your needs during construction and completely avoid resizing, you are practically guaranteed to have some specific instances where an insertion performs worse than O(1).
  2. Assumes a hashing function that results in no, or at least very few, collisions for your specific data. If the hash function produces a lot of collisions for your data, then down the drain goes one of the main assumptions that the hash-table idea relies on, and with it the performance of your code.

Enter suspect #1 – the hash function

If the hash function results in many collisions, then keys will be chained together under the same bucket, and the hash-table would “decay” into a linked-list, with O(n) access.

The code where this issue happened was using a struct with two 32-bit data items as the key to the hash-table. Since the standard provides no generic hash function for user-defined classes, we had to create a “hasher” class and provide it in the std::unordered_map template instantiation – surely we had a bug in our hasher that introduced a lot of collisions and caused this issue…

Here, and below, are the relevant code snippets from a repro project I’ve created to demonstrate the issue. The full code, including the supporting files needed to build it (project file etc.), is on GitHub.

But that thought left as quickly as it came – we were compiling the code for a 64-bit architecture, which meant the hash function needed to generate 64-bit values. This made it very easy to generate a 64-bit hash-value that is guaranteed to be unique for every unique key – we just concatenated the two 32-bit items into a single 64-bit value. Reviewing the code and running it in a small test project confirmed that we had not messed up the hash values:
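
For reference, the hasher was along the following lines – a reconstruction with made-up struct and field names, assuming a 64-bit build (so size_t is 64 bits wide), as in the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

static_assert(sizeof(std::size_t) * 8 == 64, "this sketch assumes a 64-bit build");

struct Key {
    std::uint32_t datum1;
    std::uint32_t datum2;
};

struct KeyHasher {
    // Concatenate the two 32-bit items into a single 64-bit value, so every
    // distinct key is guaranteed a distinct hash value: datum1 occupies the
    // high 32 bits, datum2 the low 32 bits.
    std::size_t operator()(const Key& k) const {
        return (static_cast<std::size_t>(k.datum1) << 32) | k.datum2;
    }
};
```

Note where each item ends up: datum1 lands in the high-order bits and datum2 in the low-order ones – a detail that turns out to matter a great deal.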

This was even more interesting, because a previous version of the code that just returned ‘Datum1’ as the hash value was not showing the performance issue, even though Datum1 was known to have ~20% collisions. Which meant that even though we had actually reduced the number of collisions, we saw major performance degradation… welcome to the twilight zone.

Enter suspect #2 – std::unordered_map

I worked on getting a repro of the issue in a small unit-test – setting my repro to evaluate a list of 10,000 items, it took over 45 seconds (!!!) to run… Once that was done, I could experiment with it easily to figure out what was going on.

One of the experiments was to replace std::unordered_map with boost::unordered_map… and lo-and-behold, it took 0.025 seconds to run. That’s almost 2,000 times better [2]! While I was happy I found a way to fix the performance issue, it seemed unlikely that std::unordered_map performance would be so atrocious in the general case. There had to be something else to it… It was time to pry open the unordered_map implementations of these two libraries to get a better idea of what was going on.

Bucket list

The only reason I could think of that may explain the behavior I was seeing, was if all or most of my keys were being mapped to the same bucket in the hash-table. Given that all of my keys had a distinct hash-value, that seemed very unlikely, but I figured I’d check, just to confirm my assumptions [3]. So I went hunting:

  • I loaded my weapon (good ol’ VS debugger)
  • I set and baited the trap (put a breakpoint in the hashing code – that way, I can “step-out” from there to where the hash-value calculation is being invoked, which I expected would be exactly where the bucketing is being done)
  • And then I hid in the bushes waiting for my prey (pressed F5)…
  • … and before long (just after the symbols finished loading), I had caught something!

Surely it was an odd and curious beast – just look at the function ‘_Hashval’ in the file ‘xhash’ included in the VC include dir (on my machine this was ‘Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\xhash’) – but then I ran out of hunting metaphors.

The function computes the bucket ID by taking the hash value (computed by the hasher) and masking it with a bitmask of ‘2^n − 1’ [4], which has an effect very much like doing a modulo operation with the number of buckets. This means the bucket ID is determined by the n least-significant bits of the hash value, and only those bits.
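
In code, the mapping described above amounts to something like this (a paraphrase of the behavior, not the actual xhash source):

```cpp
#include <cassert>
#include <cstddef>

// Map a hash value to one of bucket_count buckets, where bucket_count is a
// power of two (2^n): masking with (2^n - 1) keeps only the n
// least-significant bits, equivalent to hash_value % bucket_count.
std::size_t bucket_for(std::size_t hash_value, std::size_t bucket_count) {
    return hash_value & (bucket_count - 1);
}
```

With, say, 8 buckets the mask is 0b111, so hash values 0x100000007 and 0x200000007 – distinct only in their upper bits – both land in bucket 7.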

And that is the crux of the problem. Mapping a hash value to a bucket using a modulo operation is a naïve approach, as it fails to use the full hash value. Depending on your hashing function, the impact might be small or non-existent, or it could be very big. In my specific case, it meant that one of the data items used to generate the hash (namely, Datum1) was completely ignored. While my hash value was indeed unique per key when looking at it in its full 64-bit glory, when reduced to just the lowest 32 bits or fewer – which is what std::unordered_map does – almost all the keys collided. All keys were mapped to the same hash bucket, and what we ended up with was a bucket holding a linked-list of length ‘n’, which yields O(n) access.

Contrast that with the mix64_policy of boost::unordered_map, which is used for mapping the hash value to a bucket for most types. That policy is intended to use the full value of the hash for determining the bucket (although I haven’t spent much time understanding the specifics, nor am I sure I’d get them if I did), which explains why boost::unordered_map did not exhibit the perf issues seen with the VC version.

Pecking order

This also meant that altering my hashing function to concatenate the two 32-bit values in the opposite order also solved the issue, since the more diverse Datum1 item was now used for determining the bucket, while Datum2, which was the same for most keys, was ignored.

Surprisingly, making this change in the ordering of the hash value made std::unordered_map (very slightly) outperform boost::unordered_map when run with real data from my product. Since the difference was pretty small, I haven’t invested the time in profiling it, but my assumption was that this is due to the extra cycles required to apply the mix64 policy in boost, compared to the very cheap computation needed in the VC implementation.

Despite this specific outcome, in my unit-test benchmark of the different hash-table implementations, boost consistently performs ~3x faster than the VC std version, even when using the hasher with the “correct” ordering that doesn’t trigger the issue. This is by far not a comprehensive benchmark, but it’s something to take into consideration and test/profile if your code is extremely sensitive to performance.


So, what did we learn today?

  1. VC++’s std::unordered_map maps hash-values (as generated by the Hasher class, provided as a template parameter) to buckets using a modulo-like operation, which basically means only the lowest ‘n’ bits of the hash value are used
    • Where 2^n is the number of buckets
    • Which means ‘n’ is probably going to be pretty small – probably less than 16
  2. Beware when writing your own hash-function (e.g. when using a custom type as a hash-table key)
    • If using std::unordered_map, be aware that its hash-value –> bucket mapping uses only the lowest ‘n’ bits of the hash value – this means that composing a hash-value by concatenating several numeric values into a 64-bit value is susceptible to only some of those values being used.
    • Case in point – if you construct the hash value by concatenating two 32-bit values, only the one in the lowest-order bits will be used, and if that value is common among many keys, then you’re going to have a lot of collisions, even if the actual hash values are different for those keys.
  3. boost::unordered_map does not have the same issue, since its hash-value –> bucket mapping uses Science to make sure all bits of the hash value are used in determining the bucket
  4. Example project demonstrating the issue is available at https://github.com/hershi/Sandbox/tree/master/StdUnorderedMapPerfIssue

Footnotes

1. In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at (amortized) constant average cost per operation. (from Wikipedia: Hash-Table)
2. x1800 to be precise
3. …that if I have a unique hash for each key, then there will be no or few collisions
4. i.e. a bitmask with the first n bits set to 1

Best Case Scenario

Every once in a while I come across a story that really baffles me. The kind of story where someone does something that seems so utterly dumb that you cannot find any plausible reason for why they would think what they’re about to do is a good idea.

Now, everybody can make surprisingly bad choices under pressure… Pop quiz, hot shot: you’re on the highway and your gas pedal gets stuck; what do you do? WHAT DO YOU DO? Believe it or not, some people answered “reach down and try to pull it” – i.e., take your eyes off the road while driving at high speed! [1] I’m not talking about those cases. I’m talking about cases where the person has time to think through their actions and make a deliberate choice, and yet the choice they come up with is one that makes you wonder if they are recovering from a recent stroke, or maybe they’ve always been running on a lean mixture.

Case in point (based on a true story) – Jill is a development lead, and one morning she comes to work and finds an email message in her inbox from her product-manager counterpart – let’s call him Jerk Jack. The email is addressed to Jack’s boss and Jill’s boss, with Jill on the CC line, and consists of what is basically a rant about how he’s been asking Jill and her team to provide him with the schedule plan for the development of a new set of features for a while now, but has yet to receive it.
This story was told to me by a friend some time ago, and my first reaction was this


My friend and I went on to discuss this at some length – we both agreed that this had been a very stupid move on Jack’s part:

  • It wouldn’t really help much in getting him the details he needs – the dev lead didn’t have the details he was asking for, and escalating the issue won’t change that fact, regardless of whether she should have had that information or not (more on escalations, below).
  • It severely damaged his relationship with the dev team, which is the most important relationship for him to be successful in his role [2]. Restoring trust between him and the dev lead would be hard, if at all possible.

Jill also made some mistakes that contributed to this sorry outcome – she and her team had been working on the plan Jack asked for, and already had it mostly ready, but needed to review it with their manager first… But they didn’t keep Jack in the loop, so he wasn’t aware it was just around the corner. The way Jack handled the situation also implies that the relationship between him and Jill wasn’t that great to begin with, which is something both sides are responsible for. Jill probably didn’t know or understand how important the plan was for her peer, or had not taken the right steps to learn about and address his concerns before they became a full-blown problem.

When thinking back on this discussion, I started to wonder – when I heard this story, I had a near automatic head slap reaction, so it was quite obvious to me that sending that email was a bad idea… how come it wasn’t as obvious to Jack himself?
What was it exactly that made it so obviously wrong? I think the best way to describe it is that there was no best case scenario. Any realistic outcome or response to that email that I can think of would be undesirable for Jack. It’s just not clear what exactly he was trying to achieve:

  • Scenario 1: The dev team already has a plan that they can share
    In this case, Jack will probably get the information he wanted quickly after sending the email… but at what cost? Any future work he needs to do with the dev team is going to be hampered by mistrust and lack of goodwill, and it’s practically guaranteed that the team will not go out of its way to make Jack’s life easier or make him successful. The whole thing has now become personal. In this scenario the team already had something they could share, so it’s reasonable to assume that if he would have just gone ahead and asked them for the information directly, he would get it.
  • Scenario 2: The team doesn’t have the plan ready yet
    In this case, sending an email to the bosses isn’t going to magically change anything. The team would need to complete their planning before they can provide Jack with the information, because they can’t share information they don’t have.
    The best that Jack could hope for is that the bosses will pressure the team to provide the answer sooner, by increasing the priority of the task and/or working harder. But that is a completely different discussion altogether – it’s about priorities, and Jill is the right person to discuss those priorities with; if Jack had good reasons to ask the team to increase the priority of the work he needed, there’s a good chance the team would do so gladly, and even throw in the extra hours, since they’d understand the importance. Even if Jill and Jack could not agree on the priorities, that would have set things up for a more effective and amicable escalation – jointly asking for the managers’ help in aligning the priorities, rather than finger-pointing.
    Instead, by applying pressure through their manager, it ends up being shoved down their throats… [3] That is, assuming the dev manager plays along with Jack. If the dev manager decides to back his team instead, then this ends up being even worse for Jack – the plan would still be delivered at the same time it would otherwise, but the relationship with the dev team still suffers the damage.

So basically, the best case scenario is that he gets the information a bit earlier than he otherwise would, at a big cost. At the same time, having a very serious talk directly with the dev lead would have yielded a result with the same upsides but without the downsides. There is no best-case scenario – all the outcomes of sending this email are bad. So, again, how could Jack not see this?

The key difference between you, hearing the story as a third party, and Jack, being one of the protagonists, is the emotional context. For Jack there was stress involved, since this was blocking him from making progress on one of his primary goals. He was also quite likely angry with the dev team for putting him in this position. So my guess is that he was viewing his actions from the angle of “how can I put pressure on the team to deliver the goods?”, rather than “what’s the best way for me to get the plan as fast as possible?”. If Jack were to ask himself “what’s the best case scenario I could hope for when sending this email?”, maybe he’d think again before pressing the “Send” button.

One type of situation in which I wish more people would ask themselves the best case scenario question is when escalating issues. Escalations are a useful tool when working in a big organization – it is effective in resolving issues where agreement cannot be reached, especially when:

  1. The root cause is contradicting goals or expectations between the involved parties, or…
  2. In cases where there’s no “correct answer”, and the discussion has reached an impasse, requiring someone to make a judgement call

The thing is, it’s useful for little else. As such, it’s a great tool, but one that you should very rarely need. Nevertheless, I keep seeing people use it inappropriately, which almost always ends up with bad results for everyone – and much like Jack’s case, it’s unclear how anyone thought it was a good idea.
Even in cases where escalating is appropriate, it is much more effective when the parties reach a mutual understanding that they need input from a higher authority and escalate together, rather than escalating in a single-sided manner – and yet most of the escalations I see are done single-sided.

Asking “what am I trying to achieve?” before escalating an issue helps focus the escalation on the actual decision that needs to be made:

  • Are we having an issue reaching an agreement because our teams have different priorities? We’re escalating because our managers have the high-level view (and the authority) needed to adjust the priorities and eliminate the conflict.
  • Are we trying to get a decision between two options, where there’s no objectively better option? Have we been going in circles, and cannot convince each other to agree on an approach? We’re escalating so that our managers can make a call (the rules are that we need to stick with whatever decision they make).
    • Keep in mind though, that it’s always preferable if we can work it out ourselves – is this such an important point for both of us? Is the other option so much worse? Can one of us disagree and commit?
  • Am I trying to influence you to make a change in an area you own? Suggesting what I think is a better solution is OK, and trying to convince you is also OK, but I need to come to terms with the fact that you own the decision, rather than escalate to our managers (even if I think they’ll back me up)… that is, unless I think what you’re doing is going to have an exceptionally big impact and basically “drive us over a cliff” [4].
  • You’ve been working with me for a while, and you think I’m an idiot? Let me know, rather than going to my manager! Maybe I can get better, or change some of the things that bother you? Or maybe there are some hidden assumptions or miscommunication that are preventing us from working well together, which can be hashed out through an honest talk? If you escalate instead of talking to me directly, that means I’ll get the feedback through a third party, which makes it very likely that the feedback you intend to give, and what I end up receiving are different. Also, I’m much less likely to take the feedback well.
    • OK, you’ve given me feedback multiple times, but nothing changed? Consider letting me know that you’re going to talk to my manager – that may or may not be the right thing, depending on the people involved and the specific circumstances – but whether you do that or not, remember that giving feedback in public is rarely a good idea. Communicate with my manager in private, and preferably not in email.
      • … feedback doesn’t seem to work? You think I’m beyond redemption? That does entail talking to my manager… but the goal here is different. You’re trying to get me out of the way (read: fired) – that’s not necessarily a bad thing (I mean in terms of the greater good. Obviously, it’s a bad thing for me). These are uncommon circumstances [5], and they require a completely different approach [6].

So my genuine plea to anyone reading this – just before you hit that ‘Send’ button, ask yourself “What’s the best case scenario?”

Footnotes

1. In case you don’t care to read through the linked page – the right thing to do when your gas pedal gets stuck is put the car in “neutral” gear
2. I mean successful in the sense of shipping a great product, office politics aside. If you’re working at a place where office politics go against the good of the product and the team, the solution is not to play by those rules, but rather to find a good place to work at.
3. Which, by the way, is a great way to get the best people in your team to quit
4. Hint: The wording of a form in our application probably does not have an exceptionally big impact
5. I hope. If not, you should consider moving to a different workplace
6. at a minimum, I’d suggest in-person discussion rather than email

Interesting read: The Google Linear Optimization Pack

From the Google Developers Blog: Sudoku, Linear Optimization, and the Ten Cent Diet

Two specific points that I found very interesting with this post:

  1. Google is exposing this not only as an API (developer friendly), but also as an add-on to their spreadsheet – it follows the trend towards exposing tools that used to be available only to developers to a broader crowd of people who are tech-savvy yet not professional developers.
  2. This anecdote exemplifies what a miraculous thing computing power is, in the sense that questions that were nearly impossible to answer in the past [due to computational workload] are often trivial to answer nowadays… with less specialized knowledge, higher accuracy, and often virtually for free:

“In 1947, Jack Laderman used Simplex, nine calculator-wielding clerks, and 120 person-days to arrive at the optimal solution.

Glop’s Simplex implementation solves the problem in 300 milliseconds.”

That is 120 person-days of (calculator-assisted) human effort, versus 300 msec