Profile in Computational Imagination: Dr. Jon Krohn

Charting a Path from Neuroscience to Start-up Data Science

Headshot of Jon Krohn

I came across Dr. Jon Krohn (Doctorate in Neuroscience from the University of Oxford) through an algorithm on LinkedIn that suggested I might know him. I did not, but his professional journey intrigued me and I asked Jon to participate in this interview. Read on to learn how a neuroscientist becomes a commodities trader and multi-industry data scientist who dreams of one day measuring happiness directly.


M: Welcome, Jon. Why don't you start us off by describing the arc of your professional pathway? What intrigued me is how you made your way into these various industries and what connects the dots.

J: Throughout my graduate work, both my Master's and PhD, my trajectory was always toward commercializing science and technology, particularly via entrepreneurship. I held a senior role in the Oxford Entrepreneurs student society for several years, including founding an incubator for early-stage tech start-ups. Those did well: all the tenants received substantial seed funding, and one of them, Plink, was the first acquisition of a British company by Google.

The particular opportunities that I followed were all based on choosing to work with people that I found exciting to speak to, people that I perceived as intelligent, and from whom I could learn a lot. While I was doing the PhD, I became involved in an academically fruitful partnership with machine learning specialists at the University of Edinburgh. I spent a fair bit of time helping commercialize that research as a computational method for screening genomic results that would enable pharmaceutical companies to identify candidate drug targets and compounds. While pitching for that start-up in New York, a friend suggested I meet with contacts of his who were starting a statistics-driven hedge fund in midtown Manhattan. They were sharp and leveraging data to trade in exciting ways, so I joined them for a couple of great years working between New York and Singapore.

Hedge Fund

M: While working with the hedge fund, what kind of work were you doing? Building software, trading models, trading, or some combination?

J: It was a combination. After a couple of months there, I was actually trading. Starting off in quieter hours, I was using our proprietary software to execute strategies live. After three months, I had the opportunity to relocate to Singapore, where I had my own book to trade. During the highest-volume hours, I would be focused entirely on trading. The algorithms we deployed didn't require much human involvement, but you always wanted to keep a close eye on them.

hedge fund trading room with Jon
Electronic trading office at Bryant Park, Manhattan

M: Without giving away any secrets, at the algorithm level were you drawing on your background in machine learning, classical time series analysis or doing high-frequency trading relying on speed?

J: In general, both traditional time series analysis and machine learning are useful for analyzing financial data and economic indicators. These analyses can inform different strategies across a broad range of latencies.

Advertising Research

M: Let's move on. You ended up at Annalect, a subsidiary of Omnicom; tell us a bit about that.

J: That's right. That opportunity arose organically, like the others before it. I was looking for something more tangible, something where you were impacting people broadly as opposed to just investors. By chance, I met a woman who was an executive at Omnicom. We met at a talk and she suggested that I speak to the CEO of Annalect, which is the data and analytics backbone of Omnicom. I really enjoyed meeting Scott Hagedorn and his team there. That work and the people were fantastic; I had no intention of leaving. Starting with Oxford Entrepreneurs, however, I had been drawn toward the entrepreneurial route for a long time.

annalect wins an award
The Omnicom subsidiary Annalect being awarded 'Smart Data Agency' of the Year as well as winning the Data Science Hackathon at I-COM Global, held in San Sebastian in May 2015

I met Ed Donner, the CEO of untapt, at an Oxford alumni event in New York ages ago. He's another one of those intelligent people with countless exciting ideas. So, after eighteen months at Omnicom, I started contemplating making the dive into the deep end of entrepreneurship, joining the startup as its Chief Data Scientist in August of this year.


M: Tell us a bit more about untapt.

J: We're quickly growing a digital platform for matching software engineers with opportunities in the financial technology sector. We already have openings with dozens of firms on three continents -- North America, Europe and Asia -- and we envision our scope gradually growing beyond engineers to other technically skilled positions, as well as beyond the FinTech industry. I think we're well-positioned for this, and I have objective reason to think so: we were recently recognized as "Most Likely to Grow Exponentially", from a field of thirty young firms, at a competition held by Amazon Web Services at their loft in SoHo. As a prize, Amazon gave us $10k worth of AWS credits, which have come in handy!

untapt celebrates
untapt at a start-up pitching event at the Amazon Web Services Pop-Up Loft in SoHo, Manhattan in August 2015, where they were voted 'Most Likely to Grow Exponentially'

M: What is the current focus of your own work as you take on the role of Chief Data Scientist?

J: It's early days, but my work with untapt is focused on two primary streams. First, I'm optimizing our back-end algorithm so that people curious about the opportunities out there get matched with the best possible ones; it's somewhat analogous to a dating website in that respect. Second, I write code that produces internal reports by which we can monitor our performance on key metrics. As Michael Bloomberg has said, "If you can't measure it, you can't manage it."

M: How do you see these opportunities tied together? Just curiosity and wanting to work with smart people? Is that the theme that I hear?

J: For me, yeah! It's the opportunity to deploy cutting-edge technology and analytics with people who are great to spend your day around. Why settle for anything else?

Analytical Commonalities

M: What are the strongest analytical commonalities across these various industries?

J: All my academic research involved leveraging computational statistics to identify meaningful signal in the gigabyte-scale haystacks of genomic and brain-imaging research. All that work sitting at a terminal, getting comfortable with a Unix command line, R, Matlab, SQL, and Perl, and running highly parallel jobs on multiple servers, formed a foundation for the industry work that I have done since. In addition, the statistical models deployed throughout my doctorate formed a basis for models that I have deployed since. That covers the kinds of techniques. And, though I haven't applied them much yet, I am excited to get increasingly involved with neural-network models like deep learning, which have theory rooted in our real understanding of neural connectivity and learning in animals.

Sparse Data

M: Looking across some of your publications, you have worked on techniques for dealing with high-dimensional sparse data. I know it seems strange in the era of Big Data to talk about sparsity, but share some of your thoughts on dealing with it.

J: I think it is interesting that the more data we have, the more likely it is to be sparse. Think about the Netflix Prize data set. You have tens of thousands of films rated by hundreds of thousands or millions of users, but any individual user might have rated only dozens of those films, which generates an enormous sparse data set. A lot of industry data sets look like that. Any time you have a lot of parameters and not a lot of data relative to the number of parameters, you need to be thinking about Bayesian techniques. Bayesian techniques with some well-thought-out priors are great candidates for the methodologies you're going to need to crack those problems; traditional frequentist statistics will break down.
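Jon's Netflix-style example is easy to make concrete: a ratings matrix over all users and films is almost entirely empty, so you store only the observed entries. A minimal sketch in Python (the user counts and ratings below are made up purely for illustration):

```python
import numpy as np
from scipy.sparse import coo_matrix

rng = np.random.default_rng(0)

# Hypothetical scale: 100k users, 20k films, 2M observed ratings,
# i.e. each user has rated only a few dozen films on average.
n_users, n_films = 100_000, 20_000
n_ratings = 2_000_000

# Store only the observed (user, film, rating) triples.
users = rng.integers(0, n_users, size=n_ratings)
films = rng.integers(0, n_films, size=n_ratings)
ratings = rng.integers(1, 6, size=n_ratings).astype(np.float32)

R = coo_matrix((ratings, (users, films)), shape=(n_users, n_films))

# The fraction of cells that are filled in is tiny.
density = R.nnz / (n_users * n_films)
print(f"density = {density:.4%}")  # well under 1% of cells observed
```

Storing the same matrix densely would take two billion cells; the sparse representation keeps only the two million triples that actually carry information.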

M: Are you seeing much use of Hadoop (map/reduce) for parallel processing in your work?

J: I have used Hadoop, but it is not yet something I use every day, though it is definitely increasing in importance. If you are working with terabyte-scale datasets, the ability to deploy something like Hive, Pig, or Spark is critical.

Advice for recent PhD Graduates Heading for Industry

M: If another recent PhD graduate is thinking about trying a similar professional route to your own what advice would you give them?

J: Spending as much time as you can doing analysis with the largest-scale data you can get your hands on, as opposed to just developing laboratory skills, is key. There are lots of open-source data sets out there, particularly at curated repositories. If you can find a way to tie those into your research, that is fantastic. Get some experience with the commercial world through internships in industry or by getting involved with commercially minded student societies. Take a leadership or organizational role in activities, and present your research to lab-mates or at conferences to get comfortable communicating your techniques and findings to a broader audience. Take the time now while you can; there is so much flexibility when you are a student to learn hard data science skills.

M: How do you see the emerging focus on "data science" in relationship to more traditional fields such as statistics, computer science, operations research, computational linguistics, and various streams within the artificial intelligence field?

J: That is a great list of fields, particularly statistics and computer science, but all of them contribute. Data science is the merger of them, and while a data scientist will likely never be as skilled as a true expert in any one of those fields, she can develop sufficient understanding in all of them to confidently deploy the aspects she needs and hack them together into a commercial-grade solution.

Imposter Syndrome

M: One of the things that I wonder about is data scientists struggling with "Imposter Syndrome." I know I do, because I might use a technique like neural networks as an alternative to traditional statistics for predicting something, but then I think, "I am not trained as a computer scientist, and there are thousands of people in the world who understand this technique better than I do." Data science is so interdisciplinary...

J: Yeah, absolutely. The nice thing about data science is that you can test things out and convince yourself that an approach has worked: you can cross-validate and create charts that demonstrate it. But yes, you are absolutely right, there is this underlying fear that someone who is truly expert in some aspect of artificial intelligence is going to ask you a question that leaves you feeling like you have no idea what you are talking about.

M: Good to hear I am not the only one who feels that way sometimes... You fairly recently came out of neuroscience. What are some things that are obvious to people inside the field that I as a curious outsider might not realize yet?

Neuroscience Developments

J: The rapidly dropping cost of genomic sequencing should provide the opportunity to increasingly uncover the molecular underpinnings of neurological diseases, and even psychiatric ones, which are trickier. The psychiatric ones are among the most debilitating issues globally in terms of lost years of high-quality life, so hopefully we make serious progress there. I can't say for sure that a decade will be enough time for these advances to translate into meaningful treatments, but I certainly hope so.

Outside of that, for me one of the big mysteries is the hard problem of consciousness, which we appear no closer to resolving despite incredible efforts from people like Francis Crick, co-discoverer of the double-helix structure of DNA, as well as Christof Koch and a number of other great minds. I would be really impressed if we could make strides in understanding the basis of consciousness. It is a mind-boggling thing to think about, and we don't understand it at all. We all experience it, and yet it cannot be defined in any rigorous scientific manner.

M: So the debates, if I recall, are between those who believe a mechanistic approach is the answer, that if we simulate the brain at a sufficiently complex level consciousness will emerge, and those who contend that this approach won't get us there.

J: Exactly. But even if we create a machine that says, "Hey, I am conscious!", you still don't know any more than you know that your best friend is conscious.

Measuring Happiness: Neurologically

M: So Jon, what do you want to measure that you can't yet?

J: The kind of thing that would make a data scientist feel valuable would be to help people with their long-term, sustainable happiness. That is supposed to be the goal of society, in some ways. Happiness is really tricky to measure: we are pretty much dependent on people's self-reports, which are inaccurate, highly biased, and inherently subjective. But certainly that kind of area; being able to convert the material prosperity that we have today, at least in parts of the world, into people's actual day-to-day contentment.

M: So is the measurement one of emotional states? If it isn't self-report, is it a neuroscientific definition of happiness?

J: Yeah, I guess if somehow you could reliably quantify that, and have it become a focus of research and capital, it could be really rewarding.

Data Science Team Composition

M: Shifting topics, tell me about your ideal cross-disciplinary data science team.

J: Probably the most important counterpart to the data scientist is a good [software] engineer. Data scientists seldom come from a hard computer science background; although we try our best to be good programmers, having great programmers to work with is essential. I think that pairing is the most important relationship on a good data science team. Among the other roles, of course, are great sales and marketing people who can convey what you are doing, so that the data scientist doesn't have to be involved in client meetings all the time. On one side, you have people who can get into the nuts and bolts better than you can; on the other, people more specialized in the business side of things.

Tool Talk

M: Talk to us about your current analytical tool stack.

J: R is still my bread and butter, and I think it's the place where most new statistical packages in the industry are born. Python is really important in this space for transitioning analytical work into a production environment. So R and Python are both really important today. Of course, being able to query structured and unstructured data is key, and tools like MongoDB matter there. And if you are getting into big data sets, being able to deploy things like Spark and Hive is critical as well.

M: Spark seems like it will take off; it's gathering momentum.

J: I agree, totally.

M: Have you looked into Julia?

J: I know what it is and have looked at some speed comparisons with Python, but that is about it.

M: I played around with it but decided it was too early for me.

J: There are so many things that I want to get better at, for example my Python technique or SQL. I worked closely with Derk Landes at Annalect, and the things he could do with SQL blew my mind: Derk could write Bayesian statistical algorithms from scratch in SQL. He wanted to learn Bayesian inference, and since SQL was what he knew best, he just wrote it all in PostgreSQL. There are things like that I want to improve at before I get into Julia. Another area where I see huge potential for the kinds of problems I address is learning much more about machine learning and deep learning approaches.

M: What intrigues you professionally these days?

J: There is a package that I am really interested in playing around with called Stan, which comes out of Andrew Gelman's lab at Columbia. A friend of mine, Rob Trangucci, is developing the R implementation. It is a super cool package with its own language. It is a way of building Bayesian models, in particular Bayesian hierarchical models, where you have some attributes nested within others. The prototypical example is modeling classrooms nested within schools, nested within boards of education, for some kind of national educational outcomes study. If you are doing any kind of Bayesian analytics, it is definitely worth a look.
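The payoff of the hierarchical structure Jon describes is partial pooling: estimates for groups with little data get shrunk toward the overall mean, while well-sampled groups keep their own estimates. Stan does this with full Bayesian inference; purely as an illustration of the shrinkage idea (this is not Stan, and the classroom numbers are invented), here is an empirical-Bayes sketch in Python:

```python
import numpy as np

# Made-up per-classroom test scores: observed mean and number of students.
group_means = np.array([78.0, 62.0, 91.0, 70.0])
group_sizes = np.array([5, 40, 8, 100])

sigma2 = 100.0  # assumed within-classroom variance (treated as known)
tau2 = 25.0     # assumed between-classroom variance (the hierarchical prior)

grand_mean = np.average(group_means, weights=group_sizes)

# Shrinkage factor: classrooms with few students are pulled harder
# toward the grand mean, because their raw means are noisier.
B = (sigma2 / group_sizes) / (tau2 + sigma2 / group_sizes)
shrunk = B * grand_mean + (1 - B) * group_means

for m, n, s in zip(group_means, group_sizes, shrunk):
    print(f"n={n:3d}  raw mean={m:5.1f}  shrunk estimate={s:5.1f}")
```

The 5-student classroom moves substantially toward the grand mean, while the 100-student classroom barely moves, which is exactly the borrowing of strength that hierarchical models formalize.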

At a more overarching level, what intrigues me professionally is leveraging statistical and computational techniques to improve people's quality of life, their day to day satisfaction, peace and contentment.

Deep Learning Thought Leaders

M: Who do you think of as having a computationally informed imagination?

J: From my perspective, the most imaginative and powerful computational technique today is deep learning. The biggest names in that space are probably Geoffrey Hinton (University of Toronto and now Google), Yann LeCun (New York University and now Facebook), and Yoshua Bengio, who's at the University of Montreal.

M: I want to thank Jon for sharing some of his professional journey with us. You can find Jon on Twitter at @data_jk and on LinkedIn.