2021 projects, #4: learn something new

I have always sought out situations where I think I could learn something interesting, and 2021 will be no different. Many of the things I have learned over the past decade have come by Brownian motion and absorption – spend enough time in a community and you will pick up its interests. So I thought it would be nice to learn something in a more structured fashion this time.

If there is one thing – something not Covid – that has popped up repeatedly in the space around me this year, it is AI and machine learning. Many papers presenting new neuro tools built with deep learning and other ML approaches. That fantastic protein folding study. The Semantic Scholar search engine’s amazing new TL;DRs of papers. And I attended a very interesting webinar on possible applications of AI in music (I think it was a Stockholm AI event, but their event calendar is bugging out on me at the moment, so I can’t verify).

What entices me is the amazing creativity I see in many approaches – there is obviously a wide range of AI applications that I had not been able to imagine.

So I have signed up for two courses: one translated into Swedish – mostly to pick up the Swedish terms, if they even exist yet, but also because the translation is fantastic – and one in English from Coursera.

My 2020 project was to learn Python. Since I effectively started it in September, I haven’t come that far yet (I am at the NumPy stage), so it will continue into 2021. I was also gifted my friend Benjamin Auffart’s new book on ML in Python, the Artificial Intelligence in Python Cookbook, so I have the resources to make this another branch of the AI learning project.

It really *is* FORTRAN, all the way down

Two weeks ago I learned something that completely shifted what I thought I knew, and which seems not to be widely known. Thanks go to Mike Croucher (Walking Randomly), whose blog I have followed for as long as I can remember following blogs.

Did you know that programming languages are built on other programming languages? Perhaps it should have been self-evident, but I had never really given any thought to where programming languages actually come from. Second: did you know that large portions of Python and R are built on, and dependent on, FORTRAN?! (Yes, I know it is supposed to be written ‘Fortran’ nowadays, but in my mind it remains FORTRAN 77, as in my student days – no, I am not ancient, but my teachers’ tools were.)
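You can actually ask Python about this lineage yourself. As a minimal sketch (assuming a standard NumPy installation; the exact output format varies between NumPy versions), asking NumPy how it was built reveals the BLAS/LAPACK linear algebra libraries – projects with deep Fortran roots – that it links against:

```python
# A minimal check, assuming a standard NumPy install:
# print which BLAS/LAPACK builds (Fortran-lineage linear algebra
# libraries, e.g. OpenBLAS or MKL) NumPy was compiled against.
import numpy as np

np.show_config()  # output format differs between NumPy versions
```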

The reason I blog about this now is that I mentioned it the other day to an old friend who is a real, bona fide computer scientist and researcher, and he DIDN’T know.

And as MC tells it: “Much of the numerical functionality we routinely use today was developed decades ago and released in Fortran. More modern systems, such as R, make direct use of a lot of this code because it is highly performant and, perhaps more importantly, has been battle tested in production for decades. Numerical computing is hard (even when all of your instincts suggest otherwise) and when someone demonstrably does it right, it makes good sense to reuse rather than reinvent. As a result, with no Fortran, there’s no R.”

And yeah, the same goes for Python, and a lot of other really basic (meaning not ‘simple’, but fundamental) numeric tools that are used everywhere. At least one of SciPy and NumPy has been a part of every Python toolset I have tried so far.
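To make that concrete, here is a small sketch (assuming SciPy is installed) of calling one of those battle-tested LAPACK routines directly through SciPy’s Fortran wrappers – the same kind of routine that numpy.linalg.solve leans on under the hood:

```python
# A small sketch, assuming SciPy is installed: solve A x = b by calling
# the LAPACK driver dgesv directly through scipy.linalg.lapack, whose
# wrappers are generated (with f2py) around the original Fortran code.
import numpy as np
from scipy.linalg import lapack

a = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

lu, piv, x, info = lapack.dgesv(a, b)

print(x)     # solution vector, here [2. 3.]
print(info)  # 0 means the Fortran routine reports success
```

Nothing exotic there – just a reminder that the heavy lifting happens in code whose ancestry goes back to Fortran libraries written decades ago.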

So what is the problem with many of our most indispensable software tools having a well-optimized FORTRAN skeleton? Compilers are the issue – the tailor-made tools that translate your code into the *really* basic low-level instructions your specific processor understands. And processors are different. Hence, FORTRAN has over a dozen open source and commercial compilers adapted to different brands of processors – and for the commercial ones you have to trust that the organization that owns the compiler wants to keep supporting it. The most commonly recommended compiler at the moment is commercial. Open source efforts seem to have kept in step, more or less – though there were (and partly still are) question marks over whether everything FORTRAN-dependent would work well on Apple’s new Arm-based ‘Apple Silicon’ machines. (Imagine getting a shiny new expensive Apple computer – heavily marketed to people with rather vague knowledge of the technical aspects – and everything goes wonky? The uproar.)

So, in essence, many modern open source computing resources rest on weaker legs than is really comfortable. It reminds me of that fantastic XKCD on dependencies. (There really is an XKCD for every tech scenario imaginable.)