# How to look like a statistician: a developer’s guide to probabilistic programming

– And we’re back at fsharpConf.

– Hello. – And I’m here with Evelina.

– Hello. – Awesome. I’m real excited that you were able

to make it here to Redmond in person. – Thank you. It’s exciting to be in the Channel

9 Studios so thanks for having me. – Yes. Yeah, we’re really excited

about your talk. Last time we talked you lived in Cambridge

and I heard you moved to London. – Yes, because I started working

in the Alan Turing Institute which is the British national

institute for data science. – That sounds awesome. So, you’re a data scientist? – Yes, my official title is a data scientist.

– It sounds really interesting. Are there any cool projects

that you’re going to work on? – Next month we are starting a really cool project

with the National Air Traffic Control in the UK where we will be basically automating

parts of their training system and putting AI agents into that. So, that’s going to be a lot of fun. – That sounds really cool.

– So, be careful if you fly into the UK. – And your talk title sounds really cool,

“How to Look Like a Statistician.” – Do you want to look like a statistician? – I do. I mean, yeah.

– I think you’ll switch from the enterprise usage of F# to more cool uses for functional programming. – It sounds cool. So, I’ll let you take it away. – Thank you.

– Yeah, let’s listen to your talk. – So, my title is, “How to Look Like a Statistician: A Developer’s Guide to

Probabilistic Programming.” So, if you ask someone if they want

to look like a statistician, well, am I a statistician? That’s a question. As I mentioned I work as a data

scientist in the Alan Turing Institute and if I ask someone,

do I look like a statistician? They probably imagine someone like this. This is actually a famous

statistician, Sir David Cox. He worked on survival models,

but if you also search what statisticians look like, do you want to look

like a statistician? The Internet vision of a statistician

is someone like this and this is a stock photo

of a statistician and I really like this photo

because the guy is wearing a lab coat. Statisticians don’t wear a lab coat. And he’s actually writing on

the screen the other way around. So, hopefully after this talk you

will know how to look like this guy. So, I’ll be talking about

probabilistic programming. So, what is actually

probabilistic programming? You have probably heard

about probabilities, probably, and probabilistic programming

is about creating probabilistic models which is a huge area

of machine learning. So, what are probabilistic models? I’ll be using a very nice use case

where I actually used probabilistic models and the data set

that I played with and that’s

Stack Overflow Developer Survey Results from 2017 and the data

scientists at Stack Overflow are really nice and let others

download the data from the Stack Overflow Developer

Survey. So, I looked at the data

and when they published the data, there were a lot of

articles in the news. These were the titles and it got

picked up by major media like the BBC, and they ran articles saying that

programmers who use spaces are paid more. I really like the quotes here,

by the way, “are paid more.” So, I thought, okay,

does it make sense that programmers who use spaces

make more money than those who use tabs? So, I looked at the data and this is a salary

distribution when you look at the data. And there was a field for people

to fill in their salary and this is the distribution. You can see it looks slightly strange. You can see that there are

two bumps in the distribution. One in the beginning

and one in the middle. So, I thought, “Okay, there is

something strange happening.” Who are the people

who report very low salaries? So, I looked at the people who report

their annual salary lower than $3,000 which comes to about $250 per month. That’s not a huge salary. Most developers who reported

this salary came from India which, unfortunately, may be true

because developers in India are probably not paid that well. But the second country that

reported very low salaries is actually Poland and then Russia and then among the first

10 countries there’s also Germany. And then I thought, “Okay, there is something

really strange happening there,” because if developers in Germany report a salary of $250

per month that’s not much. That’s really, really low. So, I decided to look at the data in more detail and this is the distribution

from different countries, from France, India,

and the United Kingdom. You can see that these distributions

have a nice, big bump in the middle. And then I plotted some

of the suspicious countries. So, this is the salary distribution

in central and eastern Europe coming from Germany,

Poland, and Russia. You can see that these

distributions look different. For example, in the Polish one there

are two nice, big bumps. And in the Russian, that’s the green one,

the first bump is even larger than the second bump. And for Germany there is one small bump

and then a big bump in the middle. So, what is happening there? Are there groups of developers

that are paid nothing and then a lot of developers

that are paid a normal salary? Well, what’s wrong? Well, my suspicion was that people

didn’t read the question properly. So, this is the actual question as it

was presented in the survey and it asked: Blah, blah, blah, blah, blah. And even though “annual” is there

highlighted and underlined, I thought, “Okay, maybe people just report

their monthly salary because people just don’t

read questions properly, especially in eastern

Europe probably.” I am actually from the Czech Republic

and my background knowledge is that when people negotiate their salary and talk about how much they are paid they never talk about

their annual salaries. They always talk only about their monthly salary. So, they don’t talk about yearly salaries. And I asked my friends in Poland

and they confirmed the same thing, people don’t ever talk

about their annual salary they just ask about

their monthly salaries. I thought, “Okay, people just looked

at the question, saw the word “salary” and reported their monthly one, probably.” And I will use probabilistic

programming in F# to show you how you can

actually quantify that and find out what’s the real salary and how many people

can’t read questions. So, this is the salary

distribution in Poland. And as I said there are two bumps. They are almost equally sized. So, what can I use to model

the data like this? This is my theory that I will

try to validate using the data. And I will use something that statisticians

call “mixture distributions.” What is a mixture distribution? Well, it’s a distribution that’s formed

by a mixture of multiple standard distributions, such as normal distributions. So, this is an example of two

Gaussians or normal distributions. You can see that they all have one bump

and they are centered on different values and if we do a mixture we will just multiply them

by a certain weight and sum them together. So, now this is my mixture distribution

where I mixed the two Gaussians with equal weight. I can also change the weights. For example, now the distribution

to the right has larger weight and the distribution on the left has smaller weight and they are about the same height right now and I can change it as well. I can give one distribution a very small weight

and the other distribution a very large weight. So, you can see it, I can use this

to model my two-bump distribution that I had for the developers’

salaries in Poland. The only thing that I need to find are the parameters

of the distributions and the weights. So, now we are okay. That’s a lot of statistics. What is this whole thing about? Well, I will talk about how we can use this

in the framework of probabilistic programming. And in probabilistic programming the main

thing is that probability distributions become first class citizens

in the language. Later on I will show you how functional programming

actually creates a very nice framework for this and how we can basically put

in probability distributions and treat them as first class citizens, as if they were our variables. But first we need to look at the

mixture distributions a bit more formally. So, this is how a statistician would

write a mixture distribution. There are many Greek letters. So, what does this mean? I don’t want to go through the Greek letters and see what the mus

and sigmas are and things like that. If I want to look at it

from a developer’s perspective I want to see something

that I can understand and the most straightforward way

to interpret something with probability distributions is through sampling. And what is sampling actually? Well, let’s look at a very famous problem

called the “Monty Hall Problem.” The Monty Hall Problem actually comes

from a TV competition called, “Let’s Make a Deal,” and there are three doors

in the competition and people came there

dressed in crazy ways and they competed

to actually win a car. A nice, old car like this. So, they had three doors

and the question is, so the car, if you want to win, is behind one of the doors and two other doors have something

that doesn’t have any value and traditionally this is a goat. So, in the game you pick one of

the doors and now Monty Hall, whose name is the name

of the challenge and who was the presenter

of the competition, asks, “Okay, so what do you want to do?” I will open another door. So, Monty Hall opens another door

and shows a goat. And now they ask you, “Do you

want to change your selection? Do you want to keep your selection? The door you selected, do you think

the car is behind there or do you want to switch

to the other door, the unopened one?” And normally the probability

that the car is behind the door that you picked originally is one-third because there is equal probability

and after opening the door, do you want to switch or not? If you haven’t seen this challenge

before then you might think, “Okay, it doesn’t matter because

the probability is still one-third.” Well, we will check that. So, after you change

or not change the door Monty Hall opens the other doors

and shows another goat and a car. So, how can we do this

using probabilistic programming? Well, first I will start

with normal sampling. This is my F# code to actually do

a simulation of the Monty Hall problem. So, we’ll just create some types. A type for a door that can be either a goat

or a car and the game is just some list of doors and my strategy is to either stay

with the door I picked originally or to switch to the other door. So, these are my helper functions

to generate a game. And I’ll just run this

and explain it a bit later. So, this is my very simple,

straightforward function to play the game. So, I will just generate some state

of the game, pick a door, and then the host opens

the other door that I didn’t pick and if I decided to switch then I will just choose the other door that the host didn’t choose and I will choose one of them randomly. And then depending on my strategy

I will just say, “Okay, did I win or not?” The actual code is not

that important in this case. So, let’s have a look at the probabilities. So, the probability of winning if I stay

with the current selection, if I tried the game 10 times, is 40% and the probability if I switch

is actually 100% which is weird. Wow. What if I increase the number of samples to 100? Now, my probability of winning if I stay

with my original selection is 37% and the probability

if I switched the door is 72%. What if we increase the number

of samples even more? Let’s say to 10,000. Let’s have a look. Okay, and now the probability of winning

if I stay with my original selection is almost 34% and if I switch it’s 66%. So, we can see that it’s getting more

towards one-third and two-thirds. And as I increase the number of samples

it gets more and more precise. And this is something called Monte Carlo sampling and it’s a very common framework

for estimating distributions. When you want to see the value of a probability

distribution you just sample from it maybe using a very

straightforward sampling like I was using here or a more complicated scheme. And after sampling you just pick

a large enough number of samples and this will give you a good estimate of

the distributions and probability values. So, why should I use functional programming

in this and where does it come into it? Well, the sampling that I was showing

you here was very straightforward. The problem is very straightforward. We can also represent it in a much more interesting

way using computation expressions in F#. So, here I have my door type

that can be either a goat or a car and now I created a new type

for a Monty Hall value which gives me a door

and the probability value for that door. So, initially if I do my original selection

the probabilities are all equal so both goats and cars get a probability of one-third. And after that I will get the different values

after I change or not change my selection. Now, I’ll create another type called “distribution”

that will be a sequence of the Monty Hall values. And here are some helper functions

for creating uniform distributions. And now here comes the interesting

part with computation expressions. So, here I created another type

which will be probabilistic computation. I will call it the probabilistic computation

and it can be either a sample from the distribution or it can return a value. So, this looks quite complicated. So, what does it actually do or what

do I want to use it for? Here I use it to create my

probabilistic computation builder which is a computation expression

and I will basically just use the builder to wrap my values in that

probabilistic computation type. And my goal with this is to basically just

record what kind of selections I am doing in the game so that when I use my

probabilistic computation builder or when I create

a computation expression and I actually run it on my data it will record

what I am doing in the computation, what kind of distributions

I’m going through because distributions, remember, are the sequences

of Monty Hall values and which door I’m selecting. So, let’s look at the rest of the Monty

Hall with computation expressions. So, here I created the probabilistic

computation called “Prob” and if you have never worked with

computation expressions, for example, Async is a computation expression. So, I will use it almost the same way

as you would use Async, I would just call Prob with [inaudible] and now you can see that my

program simplified greatly because now my stay probability

or stay scenario where I just pick an initial door

and don’t change my selection will basically just create

the initial door distribution which is a uniform distribution

over two goats and one car and then I’ll just return my selection and if it contains a car then I won,

and if it contains a goat then I lost. And my switch strategy is slightly more –

slightly longer, but still much more readable than my original simulation code. So, first I will just pick my initial door

and then if I decided to switch then I will just look at the initial door and if it contains a car,

if my original selection was a car then I won a goat so I will return a goat

and if I originally selected goat then Monty Hall opened another door

which contained another goat so that means that when I switch I won a car. And the interesting thing here

is that the initial door that’s here, although I created a uniform

distribution now my initial door basically just represents

a sample from the distribution. You can see that its type is door

which is either a goat or a car. And my switch door is, again, either a goat or a car. And here I am basically doing a pattern

match on the initial door which is a sample from the uniform distribution. So, the cool thing here is that I am

working with probability distributions, but I can refer directly to samples

that I don’t know the value of. So, let’s actually run this. I have to run everything. And the only thing I have to do is actually

wrap it in my computation expression which is the prob keyword in this case. So, what do I do with it now? As I said, my Sample and Return types basically just record what I’m doing

and how the computation is processing the values. So, here I will basically just enumerate

all of my options together with their probabilities and I added some code

that will print what I am doing. And here is another helper function to actually

just compute the final probabilities. So, if I stay, let’s have a look

at what the code went through. So, the code actually went through

all the three different options that I had in my original distribution. So, I could either select a car with probability

33%, or I could select a goat with probability 33%, or I could select the other goat

with the same probability. And now my final result is a car with probability

33% or a goat with probability approximately 67%. So, here I got the values

of the whole distribution. And what if I switch? So, this is what the computation went through. So, in the first option, first I selected

a car with probability 33% and after Monty Hall

opened another door containing a goat I decided to switch,

so I switched to the other goat. So, my first option was the car with 33% probability,

but after that with probability one I got a goat. In the other scenario here

my first selection was a goat. Then after Monty Hall opened another door

I got – I decided to switch so I switched to the car

in the other option. So, again, 33% and then with probability one the car. And in the last scenario, again,

I selected a goat in the first place, Monty Hall opened the other door,

showed the other goat, and then I decided to switch

and I won a car with a probability 100%. So, you can see that my probabilistic computation, the computation expression, actually recorded

what traces I made during the game. If I go back to the

probabilistic expression here, first it recorded what was my initial door selection and what was my second selection

in the switch door. And then I could just go through the whole computation and all the probability values

that it assigned to different scenarios, and compose them, do something clever with them. In this case the probability

[inaudible] straightforward, but in a more general case

or more complicated cases the probabilities would be more complicated and then I would have to do

something clever with them. But in this case it allowed me

to just summarize them directly. So, let’s go back to our mixture distributions. I showed you this equation

which looked fairly complicated. How does it look if we do it in more normal ways

or if we talk about it in a normal voice? So, a mixture distribution of a salary

is equal to the probability that someone read the question correctly, multiplied

by the actual salary that they reported

or the other scenario is that someone made a mistake, they didn’t read the question properly

multiplied by 1/12 of the salary. And this will give me the probability

of a reported salary in general. Now we can see that I have just two

unknown things in this case. So, I have the probability that someone

can read a question properly and the value of their annual salary. So, what can I do with this? As I said, we have just two

unknown parameters. This equation is really easy

to compute if we know the values. If we knew the annual salary and if we knew

the probability that someone in Poland can read a question properly, then we could compute the probability

that any salary gets reported. How do we actually write that down

as a probabilistic computation? This is my pseudo code in F#. I really want to make probability distributions

first class citizens in my language. So, this is how I would love to write it. So, I would like to just create a salary

that will be a Gaussian with some mean and variance, with as-yet unknown parameters. And then mistake will be just a probability

distribution called a Bernoulli distribution which is just a fancy statistical way

of saying a coin toss distribution. So, just a distribution between two values. So, what’s the probability

that someone makes a mistake? And then my observed value will be

just if they made a mistake then it will be 1/12

of the annual salary and if they didn’t make a mistake

then I would just report the actual salary. So, how do we do this

with computation expressions? So, here I have some helpers first. Let’s evaluate them. And now my value is not a goat or a car,

now my unknown values are floating point numbers. And I have two distributions. So, instead of the Monty Hall values

which gave me just discrete values of goat or a car together with

their associated probabilities. I have two distributions. One will be a Gaussian with a mean and variance, and the other will be

a Bernoulli probability distribution. And my probabilistic computation

type looks exactly the same, exactly the same as I had

in my Monty Hall problem. So, again, I have a sample or I have a return value,

and the sample basically just records the distribution that I’m going through and the value and returns our probabilistic computation. And I will use this to create the probabilistic

computation builder, my computation expression. So, this is exactly the same thing I had

in the Monty Hall problem before. And, again, the goal is just to record

what I’m doing with the probability distributions because I can do something

fancy with them later on. This is my model for the whole problem. So, again, it’s wrapped in the prob

keyword in this case with [inaudible]. And the code looks almost the same

as I had on my slide. So, first, I will take the yearly salary

which is a Gaussian with some mean and some variance. And here you can see that its type is the value

which is just my wrapper around float. So, even though it’s – I’m assigning it

to Gaussian which is a probability distribution it’s looking like just a sample from the distribution,

just like one value taken from the distribution. Now, my Bernoulli distribution for a mistake,

the probability that someone made a mistake is, again, another value. And what I’m doing here is I’m just checking

if someone made a mistake, if the sampled value

from the Bernoulli is one then I will return the salary divided

by 12, the monthly salary. Otherwise, I will just return their annual salary. Again, you can see that this is very readable. You can see exactly what’s going on in there. And the cool thing right now is that

in the background I can do anything. I can do some clever things

or I can even do just sampling because I know what the structure

of the computation looks like and what the computation

is actually doing behind there. So, what I will get out of this is basically samples

of salaries and probability if someone made a mistake and if they made a mistake

then I will report one thing. If they didn’t, I will report the other thing. So, what can I do with it? Well, as I said, the mixture distributions and the

whole setting of the problem are fairly straightforward. So, how do we get the actual parameters

of the salaries like the salary mean, salary variance,

and the Bernoulli distribution, the probability that someone made a mistake? How do we actual get the values

of the parameters? Well, this is the complicated part. You can actually get a PhD out of this. These are actual slides from doing something

similar for a specific model in my PhD in Cambridge. So, this stuff is really complicated

and this is just a talk on how to do probabilistic programming. The cool thing about

probabilistic programming is that you don’t need

to know any of this. You really don’t need to know any of

these equations on how to construct them. The only thing if you are doing

probabilistic programming is you have to know how to specify

the problem and this is it. You just have to pick probability distributions

and the cool thing is it all happens in the background and you let the author of the library or whatever you are using

figure out this complicated stuff. So, I’ll be talking about the world’s

slowest probability inference engine that I created for this talk. So, the algorithm that I used is

called complete enumeration and usually you will find it

in any machine learning textbook in [inaudible] under the first chapter

on how not to do things. So, this is my distribution of salaries

in Poland and what I will do, I will try different parameter values, different annual salaries, different means, different variances

and different probabilities that someone made a mistake. So, these are just three examples here. And what I’ll do is I’ll discretize them because sometimes working with continuous

distributions is not that easy. So, I will just discretize them. That means I will create bins

and calculate how much mass is in each bin. And after this I’ll just compute the difference between

the distribution that was observed, meaning the distribution that was

reported in the Stack Overflow survey and my theoretical distribution that I got from

certain values of the means and probabilities. And then after comparing them

I will find basically just – I’ll just pick the closest distribution, the one that looks most like

the data that were reported. So, let’s have a look at the demo, the world’s

slowest probabilistic inference engine. So, here are my helper functions

on how to discretize the values. And here I will just basically

go through the distributions. For example, I know that for a Gaussian

most weight is concentrated between three standard

deviations from the mean. So, I will just take my Gaussian distribution,

compute the standard deviation, and then basically just iterate

through all the values between the mean minus

three standard deviations and mean plus three

standard deviations. And I’ll basically vary all parameters. As I said, I’ll just try

different values for the means, different values for the probability

that someone made a mistake. And as I said, I’ll just work

with discretized probabilities. The code is not very interesting

though, to be honest. And here is the interesting thing. Here I’m actually going through

the probabilistic computation, the computation expressions that wrapped

my choices in the sample or return types. So, here I’ll just basically take

the sample of my distribution and for all the different parameters,

discretize everything and then just enumerate

all the possible values. And I’ll just compute the histograms

for all the possible distributions. I’m going through this code fairly quickly,

but the important thing with probabilistic programming is that you really don’t need

to write this kind of code. The only thing you are interested

in writing is the actual model and then let the engine behind that

figure out how to compute everything. And this is my code to actually

compute the histograms, to discretize probability distributions, and now my code to pick

the most likely distributions. And I will apply it to Stack Overflow data. I have the data in a .csv file

so I’ll just use the CSV type provider and basically just filter out

the data that belong to Poland, and trust the developers

that reported their salary. Now, I got just 317 values

which is not that much, and the average reported salary

is almost $21,000 per year. And the maximum reported salary,

one lucky guy in Poland is basically paid $110,000

a month or a year, sorry. So, now actually I will run my slowest

probabilistic inference engine and find the most likely distribution that created the values

that I saw in the histogram. And it ran fairly quickly. So, actually the distribution that created

the values had a mean of $28,400 per year and the interesting thing is that my probability distribution for the mistake is 0.25. That means that 25% of developers

in Poland, I’m sorry, can’t read questions properly. I’m sorry to all my friends in Poland,

but 25% of people just don’t read questions

properly in Poland. So, this is my estimated distribution. This is how it looks

when I actually plot it. So, you can see that it estimated

a two-bump distribution. One is the distribution for the people

who actually made a mistake and the other is for the people

who didn’t make a mistake. And as I said, the cool thing

about probabilistic programming is that you don’t really have to care

about how it’s done in the background. In the real world there are two types

of probabilistic programming languages. One type is more procedural where, for example,

in Stan you specify your model in a very similar way that I specified it here in F#. And then it gets compiled into C++

and then you run the compiled program. And then the other type of probabilistic

language is where, for example, Anglican, which is a nice functional probabilistic programming language in Clojure where, again, you specify the program

in a very similar way that I used and then they just do very clever things

with it in the background, either sampling or whatever else you can use this kind of structure for, because when you actually create the program it records what’s happening there

and how the computation is progressing. You can differentiate the program

and then take the derivative of the program and do some cool stuff with that. But as I said, the cool thing is

you don’t have to care about that. That’s something for the authors of the probabilistic

programming language or library to figure out. So, this is the end of my talk. You can ping me on Twitter @evelgab

and I also have a blog at evelinag.com. And if you are interested in probabilistic programming

you can probably use computation expressions because that’s a very nice way to actually hide

all the complexities that are behind there and just take the computation

and do cool stuff with it. Thank you. – Hello.
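For reference, the Monte Carlo estimate from the first demo can be sketched roughly as follows; the types and names here are illustrative, not the actual demo code from the talk:

```fsharp
// A minimal Monte Carlo sketch of the Monty Hall estimate from the talk.
type Door = Goat | Car
type Strategy = Stay | Switch

let rng = System.Random()

// Play one game: shuffle the doors and pick one at random. The host then
// reveals a goat, so with three doors switching wins exactly when the
// first pick was a goat.
let play strategy =
    let doors = [ Car; Goat; Goat ] |> List.sortBy (fun _ -> rng.Next())
    let picked = doors.[rng.Next 3]
    match strategy with
    | Stay -> picked = Car
    | Switch -> picked <> Car

// Estimate the probability of winning by sampling many games.
let estimate strategy samples =
    Seq.init samples (fun _ -> play strategy)
    |> Seq.filter id
    |> Seq.length
    |> fun wins -> float wins / float samples

let pStay = estimate Stay 100_000    // tends towards 1/3
let pSwitch = estimate Switch 100_000 // tends towards 2/3
```

As in the talk, the more samples you draw, the closer the estimates get to one-third and two-thirds.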

– Hello, Phillip. – Hi. I am here to facilitate Q&A. – Thank you.

– I think my mic is on. It should be good.

Let’s go here. Let’s see if we got some questions. Well, it wasn’t really a question,

but we did get a statement on Twitter saying they’re halfway

through this great talk about probabilistic

programming using F# and some pictures

of empty pizza boxes. – That’s great. That’s great to hear that people

are actually enjoying that. – So, maybe –

– I suppose that’s people from Europe who are relaxing somewhere

with a beer in their hand and pizza. – Yeah, maybe there’s something

that can be done in a future version of this talk about probability

of how much pizza you think you’re going to eat

over a period of time and if you’ve maybe misread

the question or something like that. – Yeah, yeah, yeah. Or if people reported their pizza

consumption in kilograms or ounces. – Yes, and then the survey respondent

just takes those as raw numbers and says, “Ah, well this is the amount

of pizza, unitless if you will.” So, there were quite

a few insights here. I think my favorite was definitely the people

are probably not reading the question correctly when they’re reporting salary information.
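The salary model being discussed can be written down as a small mixture density; a minimal sketch, with made-up parameter values:

```fsharp
// Sketch of the two-bump salary model as a mixture density.
// The parameter values below are made up for illustration.
let gaussianPdf mu sigma x =
    exp (-((x - mu) ** 2.0) / (2.0 * sigma ** 2.0))
    / (sigma * sqrt (2.0 * System.Math.PI))

// If the annual salary is N(mu, sigma^2) and someone mistakenly reports a
// monthly figure, their report is distributed as N(mu/12, (sigma/12)^2).
// pMistake is the probability that a respondent misread the question.
let reportedSalaryPdf pMistake mu sigma x =
    (1.0 - pMistake) * gaussianPdf mu sigma x
    + pMistake * gaussianPdf (mu / 12.0) (sigma / 12.0) x

// Example: mean annual salary $28,400, and 25% of respondents reporting
// a monthly figure, as estimated in the talk.
let density = reportedSalaryPdf 0.25 28400.0 8000.0 2366.0
```

Fitting the two unknowns, the salary parameters and the mistake probability, is exactly what the inference engine in the talk does.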

– Yes. Well, that’s actually an interesting

part of being a data scientist because sometimes

you have to deal with people just not behaving the way

you expect them to. – Mm-hm.

– Because the people come into play as well and sometimes you just

look at the raw data and draw some conclusions from there

and then it doesn’t make sense. – Interesting. So, one thing that I really liked was – I really liked representing

this complex stuff in the form of a computation expression because I took calculus

and differential equations, but that was a long

time ago in college. I can’t really read that anymore,

but I can read computation expressions. So, I’m kind of curious like what are

some of the things that go really well? What are some things that go well,

and some things that may not go so well if you want to try to model this with computation

expressions or something else in F#? Because I’m kind of curious. Because this looked great, but I’m curious

if there are other things where it’s like, oh maybe a different approach

in the language might be better. I guess, I don’t really know

what that would look like though. – Well, the cool thing with this kind of expression is that it basically just records

what you are doing there. Because some people, for example,

like monads and things like that. My computation expression here is not

actually a monad, because the types don’t really match, and the only thing it’s doing is basically just recording

what we are doing in the computation. And then you can take that and do anything with that.
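A minimal sketch of this "record, don't run" idea in F# (the names `Dist`, `Model`, and `prob` are illustrative, not the talk's actual code): the builder's `Bind` never draws a random number, it only records which distribution was requested and what the rest of the computation would do with the result.

```fsharp
// A distribution description, not a sampler.
type Dist =
    | Normal of mean: float * stdev: float

// The recorded program: a chain of named draws ending in a result.
type Model =
    | Sample of name: string * dist: Dist * cont: (float -> Model)
    | Return of float

type ProbBuilder() =
    // Record the draw and the continuation; do no computation here.
    member _.Bind((name, dist), cont) = Sample(name, dist, cont)
    member _.Return(x) = Return x

let prob = ProbBuilder()

// Record a tiny model: two Gaussians, return their sum.
let model =
    prob {
        let! a = ("a", Normal(0.0, 1.0))
        let! b = ("b", Normal(5.0, 2.0))
        return a + b
    }

// An inference engine can now walk the recorded structure however it
// likes. Here, a trivial interpreter that plugs in each mean:
let rec runWithMeans m =
    match m with
    | Sample(_, Normal(mean, _), cont) -> runWithMeans (cont mean)
    | Return x -> x

printfn "%f" (runWithMeans model)  // prints 5.000000
```

The same recorded `model` value could instead be handed to a sampler, an optimizer, or a pretty-printer, which is the point being made above.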

– Right. So, your goal is not necessarily composing this

and this and this and this or something like that, just more of like a nice way

to represent the work and then do something with

that representation later? – Yes.

– Okay, I see. – I know there are some papers on basically

using monads to represent probability distributions, but that’s usually not how probabilistic

computation or probabilistic languages work because they want to do multiple

different cool things in the background, whereas there are, for example, monads that use composition to compute

the resulting probabilities. But that’s not how you do it in practice

because that gets very complex very quickly. So, normally you just record the computation and then use

some kind of optimization engine in the background. – Okay, okay, cool. Yeah, that’s great. So, we have a question there. Why not use R for statistics? – Why not use R for statistics? Well, that’s a fair question. The thing is, for example, there are not that many

probabilistic languages in R either. The thing you can do in R, for example,

if you have a more complex problem than I had here is you basically specify

your program in the Stan language, and then you compile it from R, and then you can use

the compiled model from R again. But this doesn’t actually

give you much advantage. The thing is, well, R is not

very efficient, to be honest. So, usually whenever you have

anything efficient in R it’s using some other inference

engine in the background. But from R, mostly, if people use

probabilistic programming they use Stan, where, as I said, you don’t actually specify

even the model in R; you specify it in the Stan language. – I see.

– The model specification basically looks

very much like this. You specify some probability distributions,

and then you operate with their values in some way. – Would you say that for something

like this, R might be useful to prototype something really quickly? – Yes, you can do that, but it depends on what you are doing.

– I see. Do you think it might just be a lot of work

to translate the thing that you built into something that’s actually going to run decently? – Yes. Yes, that’s always the question. But with something like this here

I chose, for example, a Gaussian distribution because that’s usually the simplest to use. You can use different distributions,

for example, a Gamma distribution, but even the Gaussian

works pretty well. So, as I said, this is a very slow inference

engine that I had in the background. So, this is more sort of for a demo purpose.

– I see. So, there’s ways that you can write it

to make it much more efficient. – Yeah.

– Okay, cool. The code might not look as nice. – Yes.

– That’s how it always goes. – I just thought this is a very cool way to represent probabilistic computation. – Yeah, I really like it. I think that’s fantastic. So, what are some – oh,

it looks like we’re out of time here. I was going to ask one more question.

– Oh no. No, I want to talk about statistics. – Evelina, I guess you can sort

of mention where you are on GitHub, Twitter,

that sort of stuff. – Yes, just ping me on @evelgab on Twitter

or just find me on Twitter or on GitHub, anywhere. I think I’m the only person in the world

with my name so just Google me. – Very easy to find.

Excellent. So, next up we’re going to

have Ody from Lagos, Nigeria speaking with Tomas

about how he came into F# from just learning the language

and kind of transitioned over the course of a year from being an absolute beginner

to making a contribution to the compiler. So, that should be a very

interesting discussion. – Thank you.

– Thanks.
