Machine Learning, C++ and You


Machine learning (ML) is a hot area of activity in the programming world these days. And, generally speaking, the ML programming language weapon of choice is Python. Practically all new ML research projects, libraries and frameworks are coded in Python or provide Python bindings exclusively. Even most pre-trained learning models provide Python-only interfaces.

This poses a challenge for C-language family programmers. One has to either craft or adapt C++ interfaces to ML code intended for Python, or move to Python itself. Admittedly, Python's syntax is fairly straightforward for anyone used to C-style languages, and most C coders should be able to get up to speed quickly. Putting up with Python's speed is another matter.

Many novice programmers don’t realize it, but Python is an interpreted language: the standard implementation, CPython, compiles source to bytecode and executes it in an interpreter loop. As such, pure Python code is vastly less performant than C++ code for compute-bound work such as tight numeric loops. Performance gaps on the order of 100x or even 1000x are not uncommon. This poor performance is frequently acceptable in interface code for GPU-accelerated libraries and frameworks, where the bulk of the runtime is spent in computational kernels and “programmer productivity” is of paramount concern. But in many scenarios, Python poses a significant obstacle to performance and ultimately to productivity, as programmers struggle to work around its limitations.

That sounds bad. By now, you might be asking yourself “If all of that is true, why is Python so popular in the machine learning space?” The answer: coding neural nets and other AI models from scratch is not trivial. The large body of available Python code makes room for “ML script kiddies”: people who barely know how to program at all, and who know nothing about the concepts underlying the lowest levels of ML machinery, are able to get results, however poor, slow and unstable, with moderate effort.

An alternative approach for C++ coders, and one I advocate, is to roll your own ML code. As mentioned before, this is not trivial, and that brings me to the purpose of this post.
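To give a feel for what "rolling your own" can look like at the very smallest scale, here is a minimal sketch of a single-neuron perceptron trained on the logical AND truth table. It is not taken from the books or the library listed in this post; the names `Sample`, `Perceptron` and `and_gate_data` are made up for illustration, and a real project would need far more (multiple layers, better activations, proper data handling).

```cpp
#include <array>
#include <cstddef>
#include <vector>

// One training example: two inputs and a target label (0 or 1).
// (Hypothetical type, for illustration only.)
struct Sample {
    std::array<double, 2> x;
    int label;
};

// A single neuron with a step activation.
struct Perceptron {
    std::array<double, 2> w{{0.0, 0.0}};  // weights start at zero
    double bias = 0.0;

    // Returns 1 if the weighted sum is positive, otherwise 0.
    int predict(const std::array<double, 2>& x) const {
        double sum = bias;
        for (std::size_t i = 0; i < x.size(); ++i) sum += w[i] * x[i];
        return sum > 0.0 ? 1 : 0;
    }

    // Perceptron learning rule: on each mistake, shift the weights by
    // rate * error * input and the bias by rate * error.
    // Returns the number of mistakes made in the last epoch that ran.
    int train(const std::vector<Sample>& data, double rate, int max_epochs) {
        int errors = 0;
        for (int epoch = 0; epoch < max_epochs; ++epoch) {
            errors = 0;
            for (const Sample& s : data) {
                const int err = s.label - predict(s.x);
                if (err != 0) {
                    ++errors;
                    for (std::size_t i = 0; i < w.size(); ++i) {
                        w[i] += rate * err * s.x[i];
                    }
                    bias += rate * err;
                }
            }
            if (errors == 0) break;  // every sample classified correctly
        }
        return errors;
    }
};

// Truth table for logical AND, which is linearly separable.
inline std::vector<Sample> and_gate_data() {
    return {
        Sample{{{0.0, 0.0}}, 0},
        Sample{{{0.0, 1.0}}, 0},
        Sample{{{1.0, 0.0}}, 0},
        Sample{{{1.0, 1.0}}, 1},
    };
}
```

To try it, construct a `Perceptron`, call `train(and_gate_data(), 1.0, 100)`, and then check `predict` on each of the four input pairs. Because AND is linearly separable, the learning rule should settle on a correct set of weights within a handful of epochs.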

While machine learning with Python is the new hotness, the old hotness was machine learning with C++ 30 years ago. And some of the resources from back then might prove useful to you, my fellow C++ specialists who don’t love Python, today.

A tiny list to get you going, all free and unencumbered by licenses:

Books – available at archive.org for free check-out:

C++ neural networks and fuzzy logic (1993)
Neural network and fuzzy logic applications in C/C++ (1994)

Public domain C++ implementations of various ML algorithms:

MLC++ – A Machine Learning Library in C++ (1997)

The implementation comes from the Stanford research group that authored a paper of the same name, together with the now-defunct Silicon Graphics. The code and utilities are in the public domain, and can be downloaded from this very site.

To be continued…

This is an original article. Please credit the source when reposting.
