Friday, August 11, 2017

Readings in Visualization

"Ex-Libris" part V: Visualization


Part 5 of my "ex-libris" of a Data Scientist is now available. This one is on visualization.

Starting from a historical perspective, particularly that of statistical visualization, and covering a few classic must-have books, the article then goes on to cover graphic design, cartography, and information architecture and design, and concludes with many recent books on information visualization (specific Python and R books to create these were listed in part IV of this series). In all, about 66 books on the subject.

Just follow the link to the LinkedIn post to go directly to it:



From Jacques Bertin’s Semiology of Graphics

"Le plus court croquis m'en dit plus long qu'un long rapport" ("The briefest sketch tells me more than a long report"), Napoleon I

See also

Part I was on "data and databases": "ex-libris" of a Data Scientist - Part i
Part II was on "models": "ex-libris" of a Data Scientist - Part II

Part III was on "technology": "ex-libris" of a Data Scientist - Part III
Part IV was on "code": "ex-libris" of a Data Scientist - Part IV
Part VI will be on communication. Bonus after that will be on management / leadership.
Francois Dion
@f_dion

P.S.
I will also have a list of publications in French.
In the near future, I will put together a list in Spanish as well.

Tuesday, June 13, 2017

Readings in Programming

"Ex-Libris" part IV: Code


I've made available part 4 of my "ex-libris" of a Data Scientist. This one is about code. 

No doubt, many have been waiting for the list that is most related to Python. In a recent poll by KDnuggets, the top tool used by respondents for analytics, data science and machine learning turned out to also be a programming language: Python.

The article goes from algorithms and theory, to approaches, to the top languages for data science, and more. In all, almost 80 books in just that part 4 alone. It can be found on LinkedIn:

"ex-libris" of a Data Scientist - Part IV

From Algorithms and Automatic Computing Machines by B. A. Trakhtenbrot




See also


Part I was on "data and databases": "ex-libris" of a Data Scientist - Part i

Part II was on "models": "ex-libris" of a Data Scientist - Part II



Part III was on "technology": "ex-libris" of a Data Scientist - Part III

Part V will be on visualization, part VI on communication. Bonus after that will be on management / leadership.

Francois Dion
@f_dion

P.S.
I will also have a list of publications in French.
In the near future, I will put together a list in Spanish as well.

Saturday, June 3, 2017

Readings in Technology

"Ex-Libris" part III


I've made available part 3 of my "ex-libris" of a Data Scientist. This one is on Technology, from a historical perspective (e.g. Turing, Shannon, von Neumann) all the way to the most recent trends (e.g. Ansible, Docker, Cloud, Continuous Integration, Performance), and can be found on LinkedIn:

"ex-libris" of a Data Scientist - Part III

My CES Industries EdLab model #804, in the History section


See also


Part I was on "data and databases": "ex-libris" of a Data Scientist - Part i

Part II was on "models": "ex-libris" of a Data Scientist - Part II

Part IV is right around the corner, and will have a significant Python section.



Francois Dion
@f_dion

Friday, June 2, 2017

Raspberry Pi 3 Canakit

It's been a long time...


I don't even remember the last time I talked about Raspberry Pi hardware on my blog. I do remember the first time, however, some 5 years ago. Meanwhile, the price of a Raspberry Pi has gone both down (Zero) and up (Raspberry Pi 3) due to economies of scale, Moore's law and removal / addition of components from the various versions.

Raspberry Pi 3


So, I realize that I've covered everything from the original Raspberry Pi model B with 256MB of RAM all the way to the Raspberry Pi 2 and Zero, but nothing on the Raspberry Pi 3. I typically don't buy kits, but I picked one up to see how that would work for people who have never used a Raspberry Pi before. The reason is that I have suggested to my fellow data scientists (and those interested in data science, and whoever else reads my posts) that they also be technologists and get better acquainted with technology and hardware, and that they get one (or four; I'll follow up on that) Raspberry Pi 3. I wanted to make sure it would be smooth sailing.

The Canakit



I ordered a Canakit with a Raspberry Pi board and two heatsinks, a case, a 2.5A power supply and a guide. It is convenient to get a single ready-to-go package quickly, particularly with free two-day shipping (you know who).

I say ready to go, but not quite. Of course, you'll need a monitor, keyboard and mouse unless you enable ssh (how to do that depends on the distro; some have ssh enabled by default). But you'll also need an SD card.

No SD card?


The kit I ordered basically gets you the case, heatsinks and power supply for $15 above the price of the Raspberry Pi 3, but no microSD card. The next kit up includes a microSD card with the OS pre-installed, but when I looked at it, the price difference was too much for just that and a plain HDMI cable.

Oh, but it's so hard to prepare an SD card! Don't panic. Use Etcher. Later, when you get your command line skills up, you'll be able to do it with dd.
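To demystify what dd will be doing later on, here is a minimal Python sketch of the same idea: copy an OS image to a target, block by block, then flush it to the device. The function name write_image, its parameters and the 4MB block size are my own choices for illustration, not something Etcher or dd prescribes. It is roughly what "dd if=image of=device bs=4M" does, minus the safety checks a real tool should have.

```python
import os


def write_image(image_path, target_path, block_size=4 * 1024 * 1024):
    """Copy image_path to target_path block by block; return bytes written.

    Hypothetical helper for illustration only. It will overwrite whatever
    target_path points to, so triple-check the target before running it.
    """
    written = 0
    with open(image_path, "rb") as src, open(target_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # an empty read means end of the image
                break
            dst.write(block)
            written += len(block)
        # Make sure the data has actually reached the card before
        # the function returns and the card gets pulled out.
        dst.flush()
        os.fsync(dst.fileno())
    return written
```

Try it on two regular files first. Only point target_path at an SD card once you are certain which device that card is on your machine, since writing to the wrong disk destroys its contents.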

If you get stuck, leave a comment or contact me on Twitter.

Francois Dion
@f_dion

Monday, May 8, 2017

Readings in Data Science models

Ex-libris


As I've previously mentioned, I recently started a 6 part series on LinkedIn called "ex-libris" of a Data Scientist. 

Part I was on "data and databases": "ex-libris" of a Data Scientist - Part i

I just posted part II, "models": "ex-libris" of a Data Scientist - Part II

It does cover machine learning, but before going there, I cover Metrics, Operations Research, Econometrics and Time Series and Statistics. Even more fundamentally, I start with Math.

Yes, if you slept through linear algebra or calculus (or analysis as it is called in certain parts of the world), check the list out. It has book suggestions and links to videos and other resources.


Also, a reminder: Python specific books will show up in part IV.



Francois Dion
@f_dion

Monday, April 24, 2017

Meet Eliza #AI



I will be presenting and directing a discussion on artificial intelligence, from various angles including the arts, Tuesday April 25th at Wake Forest in Winston-Salem, NC.

Details here:
http://www.pyptug.org/2017/04/pyptug-monthly-meeting-meet-eliza-april.html

Francois Dion
@f_dion

Thursday, April 13, 2017

Readings in data and databases

Recent readings (can you guess/decipher some of them?)

I've been fairly quiet on this particular blog this year. Besides a lot of data science work, I've done presentations at meetups and conferences, including a recent tutorial on "Getting to know your data at scale" at the IEEE SouthEastCon 2017. Notebooks will be posted on GitHub soon.

But, in the meantime...

Ex-libris

Something else I've been doing is publishing a few articles here and there. Just recently, I started a 6 part series on LinkedIn called "ex-libris" of a Data Scientist. I think many readers of this blog will appreciate this series, and particularly this first installment on "data and databases":

"ex-libris" of a Data Scientist - Part i

It covers a good variety of books on the subject, some of them pretty much must-reads for whatever corner of the computer science world you live in. Also of interest will be the Postgres, Hadoop and graph database pointers and a list of over 20 curated must-read papers in the field.

Python specific books will show up in part IV.


Francois Dion
@f_dion