## Monday, April 18, 2016

### Los Alamos 10742: The Making of

 Modern rendering of the original 1947 Memo 10742

If you haven't read the first part of this series (The return of the Los Alamos Memo 10742), go there now. There will be a link to come back here at the end, so you don't forget ...

If you remember, in the previous article, I had asked the students (and, you, the reader) to try this exercise:

"Replicate either:
a) the whole memo
or
b) the list of numbers
Whichever assignment you choose, the numbers must be generated programmatically."

One possible way

We'll use Python 3 and do b):

In [1]:
```
def num_to_words(n):
    """Returns a number in words, covering 0 to 100 inclusive."""
    n2w = {
        0: 'zero', 1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five', 6: 'six',
        7: 'seven', 8: 'eight', 9: 'nine', 10: 'ten', 11: 'eleven', 12: 'a dozen',
        13: 'thirteen', 14: 'fourteen', 15: 'fifteen', 16: 'sixteen', 17: 'seventeen',
        18: 'eighteen', 19: 'nineteen',
        20: 'twenty', 30: 'thirty', 40: 'fourty', 50: 'fifty', 60: 'sixty', 70: 'seventy',
        80: 'eighty', 90: 'ninety', 100: 'one hundred'
    }
    try:
        return n2w[n]
    except KeyError:
        return n2w[n-n%10] + ' ' + n2w[n%10]
```
The famous twelve as 'a dozen'
In [2]:
```num_to_words(12)
```
Out[2]:
`'a dozen'`
In [3]:
```num_to_words(7)
```
Out[3]:
`'seven'`
In [4]:
```num_to_words(67)
```
Out[4]:
`'sixty seven'`
In [5]:
```num_to_words(100)
```
Out[5]:
`'one hundred'`
Generating the alphabetical word list, not including number 10
In [6]:
```word_tuples = sorted([(num_to_words(num),num) for num in range(101) if num != 10])
```
Now that the list is sorted alphabetically, we just want the second item of each tuple (index 1).
In [7]:
```result = list(zip(*word_tuples))[1]
```
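The `zip(*...)` idiom transposes the list of tuples, so index 1 holds all the numbers. As a small sketch on a made-up three-entry sample (not the full list), a comprehension gives the same result:

```python
# A tiny hypothetical sample, sorted alphabetically by the word.
sample = sorted([("two", 2), ("eight", 8), ("a dozen", 12)])

# Transpose with zip(*...) and keep the second row (the numbers).
via_zip = list(zip(*sample))[1]

# Equivalent: pick the second item of each tuple directly.
via_comprehension = tuple(num for _, num in sample)

print(via_zip)
print(via_comprehension)
```

Either form works; the comprehension is arguably easier to read when only one column is needed.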
Let's print this.
In [8]:
```print(str(result)[1:-1])
```
```12, 8, 18, 80, 88, 85, 84, 89, 81, 87, 86, 83, 82, 11, 15, 50, 58, 55, 54, 59, 51, 57, 56, 53, 52, 5, 4, 14, 40, 48, 45, 44, 49, 41, 47, 46, 43, 42, 9, 19, 90, 98, 95, 94, 99, 91, 97, 96, 93, 92, 1, 100, 7, 17, 70, 78, 75, 74, 79, 71, 77, 76, 73, 72, 6, 16, 60, 68, 65, 64, 69, 61, 67, 66, 63, 62, 13, 30, 38, 35, 34, 39, 31, 37, 36, 33, 32, 3, 20, 28, 25, 24, 29, 21, 27, 26, 23, 22, 2, 0
```

If you read the comments on the previous article on the subject, you surely ran into Edward Carney's almost working proposed solution. I am adding it here as another way of attacking the problem. Edward used a module named num2words. As you'll discover over years of writing Python code, almost anything you can think of has already been done. And in some cases, multiple times.

Why did I say almost working? Let's see if somebody finds the issue. If not, I'll post the correction in a future post (the very next one will diverge from this subject to talk about fractals). I'll also introduce the inflect module and, since we're introducing some NLP concepts, I'll bring in NLTK too.

In [1]:
`import num2words as n2w`
In [2]:
`key_set = []`
`[key_set.append(n2w.num2words(i)) for i in list(range(101))]`
`key_set[12] = 'dozen'`
`key_set[100] = 'one hundred'`
`numset_dict = dict(zip(key_set,list(range(101))))`
`line_breaks = [14, 30, 46, 62, 78, 94]`
`for i, k in enumerate(yvals):`
`    print('{} '.format(k[1]),end='')`
`    if i in line_breaks:`
`        print('\n')`
```---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-2-6c7998a49267> in <module>()
5 numset_dict = dict(zip(key_set,list(range(101))))
6 line_breaks = [14, 30, 46, 62, 78, 94]
----> 7 for i, k in enumerate(yvals):
8     print('{} '.format(k[1]),end='')
9     if i in line_breaks:

NameError: name 'yvals' is not defined

```
You know the solution? Post it in the comments section.

Francois Dion
@f_dion

## Monday, March 14, 2016

### The date is the title...

 J. Venn - Logic of Chance

### Turtle Graphics?

The above looks suspiciously like a printout from my first session with Apple Logo (the language, not the branding), before I figured out the command for "pen up"...

A few months back, I was reading a few books and found the above in one of them. It is titled "Logic of Chance", by John Venn (mostly known for the Venn diagram). The year? 1866.

So, where were we? Ah yes...

### 3/14/16

Yes, that famous sequence of numbers. What was the story with John Venn and pi? Whereas I used digits 0-9 in "the 10 colors of pi", John used digits 0-7, discarding all 8s and 9s. Since there were no computers back then, he picked his numbers from a book (by W. Shanks) which had 707 digits of pi, leaving him with 568 digits between 0 and 7. He mapped the digits 0 to 7 to eight compass directions (10 directions might have felt a bit odd, at 36 degrees, versus nice 45 degree lines).

Although he doesn't specify the mapping, it is easy to infer from the graph. The first digit after the decimal is 1, then 4 and we can see the path as NE, then S, so:

| Digit | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Direction | N | NE | E | SE | S | SW | W | NW |

### The random walk

He would then move by 1 unit in the direction mapped to each digit: NE, S, NE, SW, skip 9, E, and so on. (NB: This is easy to reproduce in Python with the turtle module. A quick search of my blog will get you started on this, from a pi generator to import turtle.)
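The walk can be sketched without any graphics by just computing the path coordinates. This is a minimal sketch, not Venn's procedure verbatim: it assumes only the first 50 decimals of pi (he used 707), and the names `PI_DECIMALS`, `BEARINGS` and `venn_walk` are made up for illustration.

```python
import math

# First 50 decimals of pi (assumption: a short sample, not the 707 digits Venn used).
PI_DECIMALS = "14159265358979323846264338327950288419716939937510"

# Digit -> compass bearing in degrees, clockwise from north (0 = N, 1 = NE, ... 7 = NW).
BEARINGS = {str(digit): 45 * digit for digit in range(8)}

def venn_walk(digits):
    """Return the list of (x, y) points visited, skipping any 8s and 9s."""
    x, y = 0.0, 0.0
    path = [(x, y)]
    for ch in digits:
        if ch not in BEARINGS:
            continue  # 8 and 9 have no direction, so they are discarded
        theta = math.radians(BEARINGS[ch])
        x += math.sin(theta)  # east-west component of a 1-unit step
        y += math.cos(theta)  # north-south component of a 1-unit step
        path.append((x, y))
    return path

path = venn_walk(PI_DECIMALS)
print(len(path) - 1, "steps taken")
```

The resulting list of points could then be handed to turtle or matplotlib to draw the line.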

His conclusion stated:
"The result seems to me to furnish a very fair graphical indication of randomness".

Francois Dion
@f_dion

## Saturday, March 5, 2016

### The return of the Los Alamos Memo 10742

 Modern rendering of the original 1947 Memo 10742

### The mathematician prankster

Can you imagine yourself receiving this memo in your inbox in Washington in 1947? There's a certain artistic je ne sais quoi in this memo...

This prank was made by J. Carson Mark and Stan Ulam. A&S was Administration and Services.

Ulam, well known for his work on the Manhattan Project, also worked on really interesting things in mathematics, notably in collaboration with Nicholas Constantine Metropolis and John von Neumann. You might know the result as the Monte Carlo method (so named due to Ulam's uncle always asking for money to go and gamble in a Monte Carlo casino...). Some people have learned about a specific Monte Carlo simulation (the first) known as Buffon's needle.
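For readers who have not met Buffon's needle, here is a minimal sketch of the simulation. It assumes a needle no longer than the spacing between the lines, a fixed seed for repeatability, and a hypothetical function name `buffon_pi`. Note that it samples the angle using `math.pi`, which is a common shortcut but slightly circular for an estimate of pi.

```python
import math
import random

def buffon_pi(n_throws, needle=1.0, spacing=1.0, seed=10742):
    """Estimate pi by dropping needles on evenly spaced parallel lines."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_throws):
        # Distance from the needle's center to the nearest line.
        center = rng.uniform(0.0, spacing / 2.0)
        # Acute angle between the needle and the lines.
        angle = rng.uniform(0.0, math.pi / 2.0)
        if center <= (needle / 2.0) * math.sin(angle):
            hits += 1
    # P(cross) = 2 * needle / (pi * spacing), solved for pi.
    return 2.0 * needle * n_throws / (spacing * hits)

estimate = buffon_pi(200_000)
print(estimate)
```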

### Copying the prankster

When I stumbled upon this many years ago, I decided that it would make a fantastic programming challenge for a workshop and/or class. I first tried it in a Java class, but people didn't quite get into it. Many years later I redid it as part of a weekly Python class I was teaching at a previous employer.

The document is the output of a Python script. In order to make the memo look like it came from the era, I photocopied it. It still didn't look quite right, so I then scanned that into GIMP and bumped the Red and Blue in the color balance tool to give it that stencil / mimeograph / ditto look.

Here is what I asked the students:

"Replicate either:
a) the whole memo
or
b) the list of numbers
Whichever assignment you choose, the numbers must be generated programmatically."

That was basically it. So, go ahead and try it. In Python. Or in R, or whatever you fancy and post a solution as a comment.

We will come back in some days (so everybody gets a chance to try it) and present some possible methods of doing this. Oh, and why the title of "the return of the Los Alamos Memo"? Well, I noticed I had blogged about it before some years back, but never detailed it...

### Learning more on Stan Ulam

See the Wikipedia entry and also:

### LOS ALAMOS SCIENCE NO. 15, 1987

[EDIT: Part 2 is at: los-alamos-10742-making-of.html]

Francois Dion
@f_dion

## Monday, January 4, 2016

### Stack Overflow in Spanish

In case you haven't come across it, the Stack Overflow site is now available in Spanish (Stack Overflow en español). And not all of the answers are the same as those on the English Stack Overflow. There is a good amount of exclusive content.

## #Python

For example, someone asked: "Cómo instalar MySQLdb en OS X?" (How do I install MySQLdb on OS X?)

There are several answers, but I know that mine is something I have only written in Spanish:

`Mysql-python` is only compatible with `python 2` (Python3 WOS), and the `pip` is the one from `python 3`:
`$ which pip`
It will most likely return something similar to:
`/Library/Frameworks/Python.framework/Versions/3.x/bin/pip`
To install under `python 2`, you have to select the `pip` from `python 2`:
`$ sudo pip2 install MySQL-python`
The other option is a pure `python` module that is compatible with `python 2` and `3`, such as pymysql.
Finally, to avoid version conflicts and also Apple's `python` builds (which have various problems), it is better to install `python 2.x` and `3.x` with `homebrew`, and to use virtualenv, which lets you create `python` virtual environments, each with only the requirements for that environment. Without virtual environments you always have to be explicit: `pip2` or `pip3` instead of `pip`.
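One more way of being explicit, from inside Python itself, is to run pip as a module of the interpreter you are actually using. The following is only a sketch; `pip_command` is a hypothetical helper name, and nothing is installed unless you run the command yourself.

```python
import sys

def pip_command(*args, python=sys.executable):
    """Build a pip command line bound to one specific interpreter."""
    return [python, "-m", "pip", *args]

cmd = pip_command("install", "pymysql")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.check_call(cmd)
```

Because the command starts with the running interpreter's own path, there is no doubt about which Python the package ends up in.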

Francois Dion
@f_dion

## Yet it is also just the beginning

This is not going to be a long review of the year. Perhaps in January I'll do that. But I did want to point out that it was a good year for Python. Earlier this month I looked at the TIOBE ratings for Python, R and Scala, the main languages I use on a regular basis (in decreasing order of use; I might do Java, C++ or JavaScript on occasion, but not on a regular basis anymore):

And that was a peak for Python at #4. Back in 2007, you might remember, TIOBE had named Python "Language of the year". And if we do a quick check on Google Trends for a good indicator of worldwide popularity ("learn python"), we see that this is when it started to pick up some steam. For some fun, I'm comparing to "learn java" (ranked #1 on the latest TIOBE rating):

Hey wait, what is going on in November / December 2015? :P

Let's zoom in and take a closer look at 2015:

It appears it might be overtaking Java there... It is too early to tell whether this is just a fluke; only the next few months will reveal that.

## That credit card sized computer thingy

What is also worth mentioning is that the levels of interest in learning Python and in the Raspberry Pi seem to follow a similar path, but that will be for a follow-up post. See you next year!

Francois Dion
@f_dion

## How fast are they?

 Added full size version, just click on the above
I posted the above on LinkedIn earlier this month. I hinted at the code in the header picture, but showed no code. OK, so let's get into some code here.

WARNING: Once you discover this API, you are guaranteed to ~~waste~~ spend a lot of time with it. It's not too late to turn around, you've been warned!

## REST API

The main thing I want to point out tonight is how easy it is to interact with web services in Python. In this case, the Star Wars API (swapi). First, import the usual suspects for visualization and analytics, and add some JSON handling:

In [1]:
`%matplotlib inline`
`import matplotlib.patches as mpatches`
`import matplotlib.pyplot as plt`
`import numpy as np`
`import pandas as pd`
`import seaborn as sns`
In [2]:
`import requests`
`import json`
`from pandas.io.json import json_normalize`

Ok, what next? Let's first check the swapi api:

In [4]:
`r = requests.get('http://swapi.co/api/')`
`urls = r.json()`
`urls`
Out[4]:
```{'films': 'http://swapi.co/api/films/',
'people': 'http://swapi.co/api/people/',
'planets': 'http://swapi.co/api/planets/',
'species': 'http://swapi.co/api/species/',
'starships': 'http://swapi.co/api/starships/',
'vehicles': 'http://swapi.co/api/vehicles/'}```
` `

Each service returns only one page of the data at a time, so we need to follow the link to the next page. Let's write up a function, something quick as a helper. It is not the most efficient, but it is the most readable way of doing this, and for a blog that's important.
In [5]:
`def get_swapi(url):`
`    r = requests.get(url)`
`    data = r.json()`
`    df = json_normalize(data['results'])`
`    while len(df.index) < data['count']:`
`        r = requests.get(data['next'])`
`        data = r.json()`
`        df = pd.concat([df,json_normalize(data['results'])])`
`    return df`
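For comparison, the same "follow the `next` link" idea can be sketched as a generator using only the standard library. This is an alternative sketch, not the helper used in this post: `fetch_json` and `iter_results` are hypothetical names, and the `fetch` parameter exists only so the paging logic can be tried without a network connection.

```python
import json
from urllib.request import urlopen

def fetch_json(url):
    """Download one page and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)

def iter_results(url, fetch=fetch_json):
    """Yield every record, following each page's 'next' link until it is empty."""
    while url:
        page = fetch(url)
        yield from page['results']
        url = page.get('next')

# Offline demonstration with two fake pages standing in for the web service.
fake_pages = {
    'page1': {'results': ['X-wing', 'Y-wing'], 'next': 'page2'},
    'page2': {'results': ['A-wing'], 'next': None},
}
ships = list(iter_results('page1', fetch=fake_pages.__getitem__))
print(ships)
```

The collected list of records could then be turned into a DataFrame in a single step, instead of concatenating one page at a time.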

## Chewie, we're home

And now to use it:

In [6]:
`df = get_swapi(urls['starships'])`
Finally, let's clean up the data to correct the 1000km into 1000, remove the unknowns etc, then we can display the full table.

In [7]:
`df.loc[df['max_atmosphering_speed']=='1000km', 'max_atmosphering_speed'] = 1000`
`df = df[~(df['hyperdrive_rating']=='unknown')]`
`df = df[~(df['max_atmosphering_speed'].isin(['unknown','n/a']))]`
In [8]:
`df['max_atmosphering_speed'] = df['max_atmosphering_speed'].astype(int)`
In [9]:
`df['hyperdrive_rating'] = df['hyperdrive_rating'].astype(float)`
`df.sort_values(by='hyperdrive_rating', inplace=True)`
`df`
Out[9]:

| | MGLT | cargo_capacity | consumables | cost_in_credits | created | crew | edited | films | hyperdrive_rating | length | manufacturer | max_atmosphering_speed | model | name | passengers | pilots | starship_class | url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 75 | 100000 | 2 months | 100000 | 2014-12-10T16:59:45.094000Z | 4 | 2014-12-22T17:35:44.464156Z | [http://swapi.co/api/films/7/, http://swapi.co... | 0.5 | 34.37 | Corellian Engineering Corporation | 1050 | YT-1300 light freighter | Millennium Falcon | 6 | [http://swapi.co/api/people/13/, http://swapi.... | Light freighter | http://swapi.co/api/starships/10/ |
| 0 | unknown | unknown | unknown | unknown | 2014-12-20T19:55:15.396000Z | 3 | 2014-12-22T17:35:45.258859Z | [http://swapi.co/api/films/6/] | 0.5 | 29.2 | Theed Palace Space Vessel Engineering Corps/Nu... | 1050 | J-type star skiff | Naboo star skiff | 3 | [http://swapi.co/api/people/10/, http://swapi.... | yacht | http://swapi.co/api/starships/64/ |
| 7 | unknown | unknown | 1 year | 2000000 | 2014-12-20T11:05:51.237000Z | 5 | 2014-12-22T17:35:45.124386Z | [http://swapi.co/api/films/5/] | 0.7 | 39 | Theed Palace Space Vessel Engineering Corps, N... | 2000 | J-type diplomatic barge | J-type diplomatic barge | 10 | [] | Diplomatic barge | http://swapi.co/api/starships/43/ |
| 0 | unknown | unknown | unknown | unknown | 2014-12-20T17:46:46.847000Z | 4 | 2014-12-22T17:35:45.158969Z | [http://swapi.co/api/films/5/] | 0.9 | 47.9 | Theed Palace Space Vessel Engineering Corps | 8000 | H-type Nubian yacht | H-type Nubian yacht | unknown | [http://swapi.co/api/people/35/] | yacht | http://swapi.co/api/starships/49/ |
| 2 | 100 | 110 | 5 days | 155000 | 2014-12-20T20:03:48.603000Z | 3 | 2014-12-22T17:35:45.287214Z | [http://swapi.co/api/films/6/] | 1.0 | 14.5 | Incom Corporation, Subpro Corporation | 1000 | Aggressive Reconnaissance-170 starfighte | arc-170 | 0 | [] | starfighter | http://swapi.co/api/starships/66/ |
| 1 | unknown | 60 | 2 days | 320000 | 2014-12-20T19:56:57.468000Z | 1 | 2014-12-22T17:35:45.272349Z | [http://swapi.co/api/films/6/] | 1.0 | 5.47 | Kuat Systems Engineering | 1500 | Eta-2 Actis-class light interceptor | Jedi Interceptor | 0 | [http://swapi.co/api/people/10/, http://swapi.... | starfighter | http://swapi.co/api/starships/65/ |
| 9 | unknown | 20000000 | 2 years | 59000000 | 2014-12-20T19:52:56.232000Z | 7400 | 2014-12-22T17:35:45.224540Z | [http://swapi.co/api/films/6/] | 1.0 | 1137 | Kuat Drive Yards, Allanteen Six shipyards | 975 | Senator-class Star Destroyer | Republic attack cruiser | 2000 | [] | star destroyer | http://swapi.co/api/starships/63/ |
| 3 | unknown | 50000 | 56 days | 1000000 | 2014-12-20T19:48:40.409000Z | 5 | 2014-12-22T17:35:45.208584Z | [http://swapi.co/api/films/6/] | 1.0 | 18.5 | Cygnus Spaceworks | 2000 | Theta-class T-2c shuttle | Theta-class T-2c shuttle | 16 | [] | transport | http://swapi.co/api/starships/61/ |
| 9 | unknown | 60 | 7 days | 180000 | 2014-12-20T17:35:23.906000Z | 1 | 2014-12-22T17:35:45.147746Z | [http://swapi.co/api/films/5/, http://swapi.co... | 1.0 | 8 | Kuat Systems Engineering | 1150 | Delta-7 Aethersprite-class interceptor | Jedi starfighter | 0 | [http://swapi.co/api/people/10/, http://swapi.... | Starfighter | http://swapi.co/api/starships/48/ |
| 5 | unknown | 60 | 15 hours | 102500 | 2014-12-20T20:43:04.349000Z | 1 | 2014-12-22T17:35:45.396711Z | [http://swapi.co/api/films/6/] | 1.0 | 7.9 | Kuat Systems Engineering | 1050 | Alpha-3 Nimbus-class V-wing starfighter | V-wing | 0 | [] | starfighter | http://swapi.co/api/starships/75/ |
| 4 | unknown | 65 | 7 days | 200000 | 2014-12-19T17:39:17.582000Z | 1 | 2014-12-22T17:35:45.079452Z | [http://swapi.co/api/films/5/, http://swapi.co... | 1.0 | 11 | Theed Palace Space Vessel Engineering Corps | 1100 | N-1 starfighter | Naboo fighter | 0 | [http://swapi.co/api/people/11/, http://swapi.... | Starfighter | http://swapi.co/api/starships/39/ |
| 0 | 70 | 180000 | 1 month | 240000 | 2014-12-10T15:48:00.586000Z | 5 | 2014-12-22T17:35:44.431407Z | [http://swapi.co/api/films/1/] | 1.0 | 38 | Sienar Fleet Systems, Cyngus Spaceworks | 1000 | Sentinel-class landing craft | Sentinel-class landing craft | 75 | [] | landing craft | http://swapi.co/api/starships/5/ |
| 3 | 80 | 110 | 1 week | 134999 | 2014-12-12T11:00:39.817000Z | 2 | 2014-12-22T17:35:44.479706Z | [http://swapi.co/api/films/3/, http://swapi.co... | 1.0 | 14 | Koensayr Manufacturing | 1000 | BTL Y-wing | Y-wing | 0 | [] | assault starfighter | http://swapi.co/api/starships/11/ |
| 1 | 120 | 40 | 1 week | 175000 | 2014-12-18T11:16:34.542000Z | 1 | 2014-12-22T17:35:44.978754Z | [http://swapi.co/api/films/3/] | 1.0 | 9.6 | Alliance Underground Engineering, Incom Corpor... | 1300 | RZ-1 A-wing Interceptor | A-wing | 0 | [http://swapi.co/api/people/29/] | Starfighter | http://swapi.co/api/starships/28/ |
| 4 | 100 | 110 | 1 week | 149999 | 2014-12-12T11:19:05.340000Z | 1 | 2014-12-22T17:35:44.491233Z | [http://swapi.co/api/films/3/, http://swapi.co... | 1.0 | 12.5 | Incom Corporation | 1050 | T-65 X-wing | X-wing | 0 | [http://swapi.co/api/people/1/, http://swapi.c... | Starfighter | http://swapi.co/api/starships/12/ |
| 5 | 105 | 150 | 5 days | unknown | 2014-12-12T11:21:32.991000Z | 1 | 2014-12-22T17:35:44.549047Z | [http://swapi.co/api/films/1/] | 1.0 | 9.2 | Sienar Fleet Systems | 1200 | Twin Ion Engine Advanced x1 | TIE Advanced x1 | 0 | [http://swapi.co/api/people/4/] | Starfighter | http://swapi.co/api/starships/13/ |
| 8 | 50 | 80000 | 2 months | 240000 | 2014-12-15T13:04:47.235000Z | 6 | 2014-12-22T17:35:44.795405Z | [http://swapi.co/api/films/3/, http://swapi.co... | 1.0 | 20 | Sienar Fleet Systems | 850 | Lambda-class T-4a shuttle | Imperial shuttle | 20 | [http://swapi.co/api/people/1/, http://swapi.c... | Armed government transport | http://swapi.co/api/starships/22/ |
| 6 | unknown | 2500000 | 30 days | 55000000 | 2014-12-20T09:39:56.116000Z | 1 | 2014-12-22T17:35:45.105522Z | [http://swapi.co/api/films/4/] | 1.5 | 26.5 | Republic Sienar Systems | 1180 | Star Courier | Scimitar | 6 | [http://swapi.co/api/people/44/] | Space Transport | http://swapi.co/api/starships/41/ |
| 2 | unknown | 50000000 | 4 years | 125000000 | 2014-12-20T19:40:21.902000Z | 600 | 2014-12-22T17:35:45.195165Z | [http://swapi.co/api/films/6/] | 1.5 | 1088 | Rendili StarDrive, Free Dac Volunteers Enginee... | 1050 | Providence-class carrier/destroyer | Trade Federation cruiser | 48247 | [http://swapi.co/api/people/10/, http://swapi.... | capital ship | http://swapi.co/api/starships/59/ |
| 8 | unknown | 240 | 7 days | 35700 | 2014-12-20T18:37:56.969000Z | 3 | 2014-12-22T17:35:45.183075Z | [http://swapi.co/api/films/5/] | 1.5 | 15.2 | Huppla Pasa Tisc Shipwrights Collective | 1600 | Punworcca 116-class interstellar sloop | Solar Sailer | 11 | [] | yacht | http://swapi.co/api/starships/58/ |
| 5 | unknown | unknown | unknown | unknown | 2014-12-19T17:45:03.506000Z | 8 | 2014-12-22T17:35:45.091925Z | [http://swapi.co/api/films/4/] | 1.8 | 76 | Theed Palace Space Vessel Engineering Corps, N... | 920 | J-type 327 Nubian royal starship | Naboo Royal Starship | unknown | [http://swapi.co/api/people/39/] | yacht | http://swapi.co/api/starships/40/ |
| 6 | 60 | 3000000 | 1 year | 3500000 | 2014-12-10T14:20:33.369000Z | 165 | 2014-12-22T17:35:45.408368Z | [http://swapi.co/api/films/6/, http://swapi.co... | 2.0 | 150 | Corellian Engineering Corporation | 950 | CR90 corvette | CR90 corvette | 600 | [] | corvette | http://swapi.co/api/starships/2/ |
| 1 | 60 | 36000000 | 2 years | 150000000 | 2014-12-10T15:08:19.848000Z | 47060 | 2014-12-22T17:35:44.410941Z | [http://swapi.co/api/films/3/, http://swapi.co... | 2.0 | 1,600 | Kuat Drive Yards | 975 | Imperial I-class Star Destroyer | Star Destroyer | 0 | [] | Star Destroyer | http://swapi.co/api/starships/3/ |
| 9 | 40 | 6000000 | 2 years | 8500000 | 2014-12-15T13:06:30.813000Z | 854 | 2014-12-22T17:35:44.848329Z | [http://swapi.co/api/films/3/, http://swapi.co... | 2.0 | 300 | Kuat Drive Yards | 800 | EF76 Nebulon-B escort frigate | EF76 Nebulon-B escort frigate | 75 | [] | Escort ship | http://swapi.co/api/starships/23/ |
| 2 | 91 | 45 | 1 week | 220000 | 2014-12-18T11:18:04.763000Z | 1 | 2014-12-22T17:35:45.011193Z | [http://swapi.co/api/films/3/] | 2.0 | 16.9 | Slayn & Korpil | 950 | A/SF-01 B-wing starfighter | B-wing | 0 | [] | Assault Starfighter | http://swapi.co/api/starships/29/ |
| 3 | unknown | unknown | unknown | unknown | 2014-12-19T17:01:31.488000Z | 9 | 2014-12-22T17:35:45.027308Z | [http://swapi.co/api/films/4/] | 2.0 | 115 | Corellian Engineering Corporation | 900 | Consular-class cruiser | Republic Cruiser | 16 | [] | Space cruiser | http://swapi.co/api/starships/31/ |
| 7 | 70 | 70000 | 1 month | unknown | 2014-12-15T13:00:56.332000Z | 1 | 2014-12-22T17:35:44.716273Z | [http://swapi.co/api/films/5/, http://swapi.co... | 3.0 | 21.5 | Kuat Systems Engineering | 1000 | Firespray-31-class patrol and attack | Slave 1 | 6 | [http://swapi.co/api/people/22/] | Patrol craft | http://swapi.co/api/starships/21/ |
| 5 | 20 | 19000000 | 6 months | unknown | 2014-12-15T12:34:52.264000Z | 6 | 2014-12-22T17:35:44.680838Z | [http://swapi.co/api/films/3/, http://swapi.co... | 4.0 | 90 | Gallofree Yards, Inc. | 650 | GR-75 medium transport | Rebel transport | 90 | [] | Medium transport | http://swapi.co/api/starships/17/ |
| 4 | unknown | 140 | 7 days | 168000 | 2014-12-20T20:38:05.031000Z | 1 | 2014-12-22T17:35:45.381900Z | [http://swapi.co/api/films/6/] | 6.0 | 6.71 | Feethan Ottraw Scalable Assemblies | 1100 | Belbullab-22 starfighter | Belbullab-22 starfighter | 0 | [http://swapi.co/api/people/10/, http://swapi.... | starfighter | http://swapi.co/api/starships/74/ |

That'll be it for tonight, enjoy playing around with swapi, but as the intro mentions, be warned (and it is the disclaimer I've used each time I've introduced the api into a data science class), you can spend a lot of time with this...

Francois Dion
@f_dion

## Preamble

Sometimes, you don't have direct access to the data, or the data changes over time.

Yeah, I know, scary. So that's my point in this post. Provide a URL to a "frozen" version of your data, if at all possible. Toward the end of the article I provide a link to the notebook. This repo also holds the data I used for the visualization.
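One way to let readers confirm that they hold the same frozen copy is to publish a checksum next to the data URL. This is only a sketch of that idea, not something the original notebook did; `sha256_of` is a hypothetical helper name.

```python
import hashlib

def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of` on the downloaded spreadsheet and comparing against a published value would reveal whether the source data has changed since the analysis was made.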

Let's get right into it...

In [1]:
```%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
sns.set_context("talk")
```

# Reproducible visualization

In "The Functional Art: An introduction to information graphics and visualization" by Alberto Cairo, on page 12 we are presented with a visualization of UN data time series of Fertility rate (average number of children per woman) per country:

Figure 1.6 Highlighting the relevant, keeping the secondary in the background.

Book url:
The Functional Art

Let's try to reproduce this.

## Getting the data

The visualization was done in 2012, but it only covers data up to 2010.

This should make it easy, in theory, to get the data, since it is historical. These are directly available as Excel spreadsheets now; we'll just ignore the last bucket (2010-2015).

In [3]:
```
!wget 'http://esa.un.org/unpd/wpp/DVD/Files/1_Indicators%20(Standard)/EXCEL_FILES/2_Fertility/WPP2015_FERT_F04_TOTAL_FERTILITY.XLS'
```
```
--2015-12-29 16:57:23--  http://esa.un.org/unpd/wpp/DVD/Files/1_Indicators%20(Standard)/EXCEL_FILES/2_Fertility/WPP2015_FERT_F04_TOTAL_FERTILITY.XLS
Resolving esa.un.org... 157.150.185.69
Connecting to esa.un.org|157.150.185.69|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 869376 (849K) [application/vnd.ms-excel]
Saving to: 'WPP2015_FERT_F04_TOTAL_FERTILITY.XLS'

WPP2015_FERT_F04_TO 100%[=====================>] 849.00K   184KB/s   in 4.6s

2015-12-29 16:57:28 (184 KB/s) - 'WPP2015_FERT_F04_TOTAL_FERTILITY.XLS' saved [869376/869376]
```

### World Population Prospects: The 2015 Revision

File FERT/4: Total fertility by major area, region and country, 1950-2100 (children per woman)
Estimates, 1950 - 2015
POP/DB/WPP/Rev.2015/FERT/F04
Suggested citation: United Nations, Department of Economic and Social Affairs, Population Division (2015). World Population Prospects: The 2015 Revision, DVD Edition.

In [2]:
```
df = pd.read_excel('WPP2015_FERT_F04_TOTAL_FERTILITY.XLS', skiprows=16,
                   index_col = 'Country code')
df = df[df.index < 900]
```

In [3]:
```len(df)
```
Out[3]:
`201`

In [4]:
```df.head()
```
Out[4]:

| Country code | Index | Variant | Major area, region, country or area * | Notes | 1950-1955 | 1955-1960 | 1960-1965 | 1965-1970 | 1970-1975 | 1975-1980 | 1980-1985 | 1985-1990 | 1990-1995 | 1995-2000 | 2000-2005 | 2005-2010 | 2010-2015 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 108 | 15 | Estimates | Burundi | NaN | 6.8010 | 6.8570 | 7.0710 | 7.2680 | 7.3430 | 7.4760 | 7.4280 | 7.5920 | 7.4310 | 7.1840 | 6.908 | 6.523 | 6.0756 |
| 174 | 16 | Estimates | Comoros | NaN | 6.0000 | 6.6010 | 6.9090 | 7.0500 | 7.0500 | 7.0500 | 7.0500 | 6.7000 | 6.1000 | 5.6000 | 5.200 | 4.900 | 4.6000 |
| 262 | 17 | Estimates | Djibouti | NaN | 6.3120 | 6.3874 | 6.5470 | 6.7070 | 6.8450 | 6.6440 | 6.2570 | 6.1810 | 5.8500 | 4.8120 | 4.210 | 3.700 | 3.3000 |
| 232 | 18 | Estimates | Eritrea | NaN | 6.9650 | 6.9650 | 6.8150 | 6.6990 | 6.6200 | 6.6200 | 6.7000 | 6.5100 | 6.2000 | 5.6000 | 5.100 | 4.800 | 4.4000 |
| 231 | 19 | Estimates | Ethiopia | NaN | 7.1696 | 6.9023 | 6.8972 | 6.8691 | 7.1038 | 7.1838 | 7.4247 | 7.3673 | 7.0888 | 6.8335 | 6.131 | 5.258 | 4.5889 |

First problem... The book states on page 8:
--
Yet we have 201 countries (codes 900+ are regions) with complete data. We do not have an easy way to identify which countries were added to this. Still, let's move forward and prep our data.
In [5]:
```df.rename(columns={df.columns[2]:'Description'}, inplace=True)
```
In [6]:
```
df.drop(df.columns[[0, 1, 3, 16]], axis=1, inplace=True)  # drop what we don't need
```
In [7]:
```df.head()
```
Out[7]:

| Country code | Description | 1950-1955 | 1955-1960 | 1960-1965 | 1965-1970 | 1970-1975 | 1975-1980 | 1980-1985 | 1985-1990 | 1990-1995 | 1995-2000 | 2000-2005 | 2005-2010 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 108 | Burundi | 6.8010 | 6.8570 | 7.0710 | 7.2680 | 7.3430 | 7.4760 | 7.4280 | 7.5920 | 7.4310 | 7.1840 | 6.908 | 6.523 |
| 174 | Comoros | 6.0000 | 6.6010 | 6.9090 | 7.0500 | 7.0500 | 7.0500 | 7.0500 | 6.7000 | 6.1000 | 5.6000 | 5.200 | 4.900 |
| 262 | Djibouti | 6.3120 | 6.3874 | 6.5470 | 6.7070 | 6.8450 | 6.6440 | 6.2570 | 6.1810 | 5.8500 | 4.8120 | 4.210 | 3.700 |
| 232 | Eritrea | 6.9650 | 6.9650 | 6.8150 | 6.6990 | 6.6200 | 6.6200 | 6.7000 | 6.5100 | 6.2000 | 5.6000 | 5.100 | 4.800 |
| 231 | Ethiopia | 7.1696 | 6.9023 | 6.8972 | 6.8691 | 7.1038 | 7.1838 | 7.4247 | 7.3673 | 7.0888 | 6.8335 | 6.131 | 5.258 |
In [8]:
```highlight_countries = ['Niger','Yemen','India',
'Brazil','Norway','France','Sweden','United Kingdom',
'Spain','Italy','Germany','Japan', 'China'
]
```
In [9]:
```# Subset only countries to highlight, transpose for timeseries
df_high = df[df.Description.isin(highlight_countries)].T[1:]
```
In [10]:
```# Subset the rest of the countries, transpose for timeseries
df_bg = df[~df.Description.isin(highlight_countries)].T[1:]
```

## Let's make some art

In [11]:
```
# background
ax = df_bg.plot(legend=False, color='k', alpha=0.02, figsize=(12,12))
ax.xaxis.tick_top()

# highlighted countries
df_high.plot(legend=False, ax=ax)

# replacement level line
ax.hlines(y=2.1, xmin=0, xmax=12, color='k', alpha=1, linestyle='dashed')

# Average over time on all countries (numeric columns only)
df.mean(numeric_only=True).plot(ax=ax, color='k', label='World\naverage')

# labels for highlighted countries on the right side
for country in highlight_countries:
    ax.text(11.2, df[df.Description==country].values[0][12], country)

# start y axis at 1
ax.set_ylim(bottom=1)
```
Out[11]:
`(1, 9.0)`
For one thing, the line for China doesn't look like the one in the book. Concerning. The other issue is that there are some lines that are going lower than Italy or Spain in 1995-2000 and in 2000-2005 (majority in the Balkans) and that were not on the graph in the book, AFAICT:
In [12]:
```df.describe()
```
Out[12]:

| | 1950-1955 | 1955-1960 | 1960-1965 | 1965-1970 | 1970-1975 | 1975-1980 | 1980-1985 | 1985-1990 | 1990-1995 | 1995-2000 | 2000-2005 | 2005-2010 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 201.00000 | 201.000000 | 201.000000 | 201.000000 | 201.000000 | 201.000000 | 201.000000 | 201.000000 | 201.000000 | 201.000000 | 201.000000 | 201.000000 |
| mean | 5.45045 | 5.495005 | 5.491424 | 5.265483 | 4.994911 | 4.657349 | 4.403227 | 4.122837 | 3.762972 | 3.412293 | 3.141556 | 2.992349 |
| std | 1.64388 | 1.674181 | 1.734726 | 1.849984 | 1.944553 | 2.039995 | 2.033660 | 1.952100 | 1.849278 | 1.791151 | 1.701363 | 1.562150 |
| min | 1.98000 | 1.950000 | 1.850000 | 1.810000 | 1.623000 | 1.407900 | 1.427300 | 1.349700 | 1.240000 | 0.870000 | 0.825200 | 0.937900 |
| 25% | 4.27700 | 4.201000 | 4.273100 | 3.447000 | 2.990000 | 2.540200 | 2.301500 | 2.230000 | 2.050000 | 1.889100 | 1.806100 | 1.818200 |
| 50% | 5.99500 | 6.134100 | 6.129700 | 5.950000 | 5.470000 | 4.974900 | 4.370000 | 3.800000 | 3.343000 | 2.941500 | 2.600000 | 2.479300 |
| 75% | 6.70000 | 6.764000 | 6.800000 | 6.707000 | 6.700000 | 6.525000 | 6.315000 | 5.900000 | 5.217000 | 4.637000 | 4.210000 | 3.980000 |
| max | 8.00000 | 8.150000 | 8.200000 | 8.200000 | 8.284000 | 8.500000 | 8.800000 | 8.800000 | 8.200000 | 7.746600 | 7.720900 | 7.678700 |
In [13]:
```df[df['1995-2000']<1.25]
```
Out[13]:

| Country code | Description | 1950-1955 | 1955-1960 | 1960-1965 | 1965-1970 | 1970-1975 | 1975-1980 | 1980-1985 | 1985-1990 | 1990-1995 | 1995-2000 | 2000-2005 | 2005-2010 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 344 | China, Hong Kong SAR | 4.4400 | 4.7200 | 5.3100 | 3.6450 | 3.2900 | 2.3100 | 1.7150 | 1.3550 | 1.2400 | 0.8700 | 0.9585 | 1.0257 |
| 446 | China, Macao SAR | 4.3858 | 5.1088 | 4.4077 | 2.7367 | 1.7930 | 1.4079 | 1.9769 | 1.9411 | 1.4050 | 1.1160 | 0.8252 | 0.9379 |
| 100 | Bulgaria | 2.5264 | 2.2969 | 2.2171 | 2.1304 | 2.1573 | 2.1927 | 2.0149 | 1.9458 | 1.5527 | 1.2008 | 1.2404 | 1.5005 |
| 203 | Czech Republic | 2.7383 | 2.3765 | 2.2088 | 1.9573 | 2.2108 | 2.3588 | 1.9660 | 1.9008 | 1.6455 | 1.1670 | 1.1870 | 1.4286 |
| 643 | Russian Federation | 2.8500 | 2.8200 | 2.5500 | 2.0200 | 2.0300 | 1.9400 | 2.0400 | 2.1210 | 1.5450 | 1.2470 | 1.2980 | 1.4389 |
| 804 | Ukraine | 2.8100 | 2.7000 | 2.1346 | 2.0204 | 2.0789 | 1.9798 | 2.0040 | 1.8968 | 1.6208 | 1.2404 | 1.1455 | 1.3828 |
| 428 | Latvia | 2.0000 | 1.9500 | 1.8500 | 1.8100 | 2.0000 | 1.8745 | 2.0293 | 2.1309 | 1.6322 | 1.1722 | 1.2856 | 1.4926 |
| 380 | Italy | 2.3550 | 2.2900 | 2.5040 | 2.4989 | 2.3227 | 1.8856 | 1.5245 | 1.3497 | 1.2715 | 1.2239 | 1.2974 | 1.4169 |
| 705 | Slovenia | 2.6800 | 2.3833 | 2.3354 | 2.2650 | 2.1999 | 2.1632 | 1.9280 | 1.6517 | 1.3335 | 1.2483 | 1.2114 | 1.3841 |
| 724 | Spain | 2.5300 | 2.7000 | 2.8100 | 2.8400 | 2.8500 | 2.5500 | 1.8800 | 1.4600 | 1.2800 | 1.1900 | 1.2900 | 1.3904 |
In [14]:
```df[df['2000-2005']<1.25]
```
Out[14]:

| Country code | Description | 1950-1955 | 1955-1960 | 1960-1965 | 1965-1970 | 1970-1975 | 1975-1980 | 1980-1985 | 1985-1990 | 1990-1995 | 1995-2000 | 2000-2005 | 2005-2010 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 344 | China, Hong Kong SAR | 4.4400 | 4.7200 | 5.3100 | 3.6450 | 3.2900 | 2.3100 | 1.7150 | 1.3550 | 1.2400 | 0.8700 | 0.9585 | 1.0257 |
| 446 | China, Macao SAR | 4.3858 | 5.1088 | 4.4077 | 2.7367 | 1.7930 | 1.4079 | 1.9769 | 1.9411 | 1.4050 | 1.1160 | 0.8252 | 0.9379 |
| 410 | Republic of Korea | 5.0500 | 6.3320 | 5.6300 | 4.7080 | 4.2810 | 2.9190 | 2.2340 | 1.6010 | 1.6960 | 1.5140 | 1.2190 | 1.2284 |
| 100 | Bulgaria | 2.5264 | 2.2969 | 2.2171 | 2.1304 | 2.1573 | 2.1927 | 2.0149 | 1.9458 | 1.5527 | 1.2008 | 1.2404 | 1.5005 |
| 203 | Czech Republic | 2.7383 | 2.3765 | 2.2088 | 1.9573 | 2.2108 | 2.3588 | 1.9660 | 1.9008 | 1.6455 | 1.1670 | 1.1870 | 1.4286 |
| 498 | Republic of Moldova | 3.5000 | 3.4400 | 3.1500 | 2.6600 | 2.5600 | 2.4400 | 2.5500 | 2.6400 | 2.1110 | 1.7000 | 1.2378 | 1.2704 |
| 703 | Slovakia | 3.5022 | 3.2427 | 2.9110 | 2.5410 | 2.5067 | 2.4640 | 2.2710 | 2.1537 | 1.8667 | 1.4010 | 1.2205 | 1.3100 |
| 804 | Ukraine | 2.8100 | 2.7000 | 2.1346 | 2.0204 | 2.0789 | 1.9798 | 2.0040 | 1.8968 | 1.6208 | 1.2404 | 1.1455 | 1.3828 |
| 70 | Bosnia and Herzegovina | 4.7700 | 3.9086 | 3.6830 | 3.1372 | 2.7332 | 2.1900 | 2.1200 | 1.9100 | 1.6500 | 1.6261 | 1.2155 | 1.2845 |
| 705 | Slovenia | 2.6800 | 2.3833 | 2.3354 | 2.2650 | 2.1999 | 2.1632 | 1.9280 | 1.6517 | 1.3335 | 1.2483 | 1.2114 | 1.3841 |

The other thing that I really need to address is the labeling. Clearly we need the functionality to move labels up and down to make them readable. Collision detection, basically. I'm surprised this functionality doesn't exist, because I keep bumping into that. Usually, I can tweak the Y pos by a few pixels, but in this specific case, there is no way to do that.
So, I guess I have a project for 2016...
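To give an idea of what such a helper might look like, here is a very small sketch of a greedy approach: sort the labels by height and push each one up until it clears the one below it. The function name `spread_labels` and the `min_gap` parameter are made up for illustration; this is not an existing matplotlib feature, and it only ever moves labels upward.

```python
def spread_labels(positions, min_gap):
    """Nudge label heights apart so neighbors are at least min_gap apart.

    positions maps a label name to its preferred y value.
    Returns a new dict with the adjusted y values.
    """
    adjusted = {}
    previous = None
    # Walk the labels from lowest to highest preferred position.
    for name, y in sorted(positions.items(), key=lambda item: item[1]):
        if previous is not None and y - previous < min_gap:
            y = previous + min_gap  # too close to the label below: move up
        adjusted[name] = y
        previous = y
    return adjusted

# 2005-2010 values for two of the crowded countries from the table above.
labels = spread_labels({'Italy': 1.4169, 'Spain': 1.3904}, min_gap=0.1)
print(labels)
```

The adjusted values could then be passed to `ax.text` in place of the raw data values.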