
E5 - Applications on the Real Internet


App performance on the real internet presents real challenges, especially for real-time or latency-sensitive applications like video streaming, but the Magic Modem can help us.

Build a video streaming service with #javascript and #python from scratch with an ex-Netflix engineer! Today we look at how apps perform on the real internet, starting from the fetch(URL) view of the world, moving to the basics of networks, and then adding some app metrics and tuning a real-world task. Practical teleportation comes courtesy of manaOS / the Magic Modem, letting us see streaming video play on networks around the world! codingwithsomeguy.com is #livecoding on Twitch, Mondays and Fridays.


Transcript:

Oh everybody how are you doing here we
are episode 5 the real internet this is
CSG Flix the series about building
your own streaming service from scratch
it's mostly been live coded and boy have
we got a show for you today so there is
a lot to talk about I'm going to
actually get myself the glove because
we're gonna do a bit of drawing since
we're gonna be covering mostly stuff
about the internet there's gonna be
probably less code in this episode than
a lot of the ones previously because
this this is really about kind of how
global networks work and to kind of like
build a better model for everybody of
like you know what it what it really
looks like I want to go yes and mark yes
good good nice yes it totally is visual
ASMR but no sorry that's that's his
patent so I not not stealing it from you
man
very funny last night by the way so just
to be period specific I thought you know
if you're building a streaming service
let's wear the t-shirt so we have I have
the hack day shirt on I've got the
coffee cup we are all swagged up to no
end so this ought to be interesting so
we're gonna we're gonna basically run
through well there's a lot to talk about
so why don't we just kind of kick this
off and get going let's take our app out
to the real Internet
[Music]
all right yeah that's it that's episode
five that was from the original first
first episode actually that was that was
the overview so we are here today we are
doing we are rocking a five this is all
about real and I'm gonna give it a nice
nice quotes here because it's seriously
real Internet alright so this is we want
to basically kind of go beyond the you
know kind of typical like alright I
fetch and I do a URL and magic happens
so I guess that's a magic would be a a
top hat and here's our rabbit coming out
of it but so basically this is this is
going to be kind of fun so there's we
have a bunch of different topics to sort
of go over here and we're gonna focus
mostly on as it pertains to a streaming
app so this is the CSG flix so we're
basically we're worried about the live
case this applies all to regular web
apps it's really the the stuff that I'm
covering really is is generally
applicable but it helps to have a little
bit of a model about how it works I
don't want to spend too much time and I
mean there's there's tons of like books
and talks you can go to so we're I'm
gonna try to keep a bunch of that simple
but just sort of like what are like the
key points as pertain to our particular
use case and for that I'm gonna switch
over to blue so we have this sort of
browser view of the world and in the
browser we basically just pop the URL
and then you know our web browser does
some stuff it goes over to the cloud and
then some server somewhere says oh
yeah sure that's ok cool I'll send you
back that and we're all good to go this
is a nice simplification and it
unfortunately doesn't help you very much
when you need to diagnose like why
is my web app like why is it not good
I I don't know what's happening here I
don't know what's happening here
I don't know entirely what's happening
here but it mostly looks like what was
in the lab except in the lab I mean
everything just kind of worked I mean I
just had the the server right here and I
had my my browser right here and and
everything worked fine until I got to
this thing like and and once I got here
as I described in the first episode and
there's actually a in that this is kind
of where most companies break like this
is this is the bridge too far so what
we're gonna do here is kind of
understand a little bit about like what
is in this cloud and really it's
actually just composed of a lot of
computers just sort of like your browser
computer and they're really low end
they're they're these actually we were
to use that kind of let's let's let's be
done there so they all kind of like
connect to each other and typically
these are just you can think of these
servers they're really just low-end
computers most of the time like
especially unlike you know the kind that
you're gonna have for your Wi-Fi or you
know whatever and the key thing about
all of these computers is that they have
two interfaces so this might connect
over like you know some long copper wire
you know going back to your house and it
might connect to another one of these
computers over some fast ethernet and
then that one's connected to this one
but that one happens to have three
interfaces so it's also connected to
this one and really the way that your
message eventually gets through to your
server is by passing across these
interfaces so I basically take my
request you know which let's just say
this is a GET and it comes down here
it's sort of it says all right is that
local to my local network
well no not by the time I'm getting to
like a real server so it goes well I
have a path out you know to another
interface so I'm gonna kind of go along
there and then it asks this computer
alright do you know
which way this server is like
where where is this you know well I
don't know but I know that he knows and
then actually I don't know why these
have a he gender but it knows and then
this this particular device is okay I'm
connected to a couple people here so I
know that that one is closer than going
over this one this one might be
connected to but that's not a great path
I think I think you're better off kind
of going that way and then this one says
okay that's fine I'm just gonna kind of
head all the way over here all of these
are referred to as hops
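A minimal sketch of the hop idea, assuming a made-up chain of routers where each one only knows which neighbor to hand the request to next. The router names and the table are hypothetical and are not taken from the diagram.

```python
# Toy model: every router only knows the next hop toward the server.
# All names here are invented for illustration.
NEXT_HOP = {
    "home-router": "isp-edge",
    "isp-edge": "isp-core",
    "isp-core": "transit",
    "transit": "server-edge",
    "server-edge": "server",
}

def walk(start, destination, max_hops=30):
    """Follow next-hop pointers and return the list of devices visited."""
    path = [start]
    while path[-1] != destination:
        if len(path) > max_hops:
            raise RuntimeError("too many hops, giving up")
        path.append(NEXT_HOP[path[-1]])
    return path

path = walk("home-router", "server")
print(" -> ".join(path))
print("hops:", len(path) - 1)
```

No single router holds the whole route; the path only exists as the sum of each device's local decision.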
so we're basically kind of jumping
through the network from you know and
this might be in my diagram here I've
got one well depending on what your
starting point is for a hop but I've got one
to the home router two three four five
now on a typical Internet route we're
probably looking for somewhere around
like ten to thirty hops this is kind of
rough this is going to be a bad
connectivity situation and this is going
to be pretty good so a lot of lag is
usually related to how many of these
hops but it really also has to do with
like how busy some of these links are
like like this particular link might be
super busy and that means that a lot of
traffic is going to kind of get like
stopped here and you know it'll
eventually make it over here but you can
kind of think of this like flowing water
like if you try to pour too much water
through a small hose then it just it
doesn't go faster at some point so this
is this is kind of this is this is I'm
just gonna hand wave some of this but
these are typically called buffers and
these buffers they sometimes get really
big and that general problem is
what we refer to as bufferbloat but
well that's a little more
than I want to get into today so we're
not going to worry too much about this
what we are gonna worry about is that
each of these hops so like I've got hop
one two three four and eventually my
server each of these hops is in a place
like it we don't think about it because
all of this happens so quickly like you
know typically from end to end
we might even if we're traveling
across a good part of the world that
might be like 200 milliseconds for each
of these to travel now that seems like a
lot to a gamer who's probably gonna
prefer things to be more like you know
ten milliseconds but you know because
every bit of like delay you know might
might affect your next shot
whatever kind of game you're playing but
um this is kind of amazing for anywhere
that I might be to anywhere where the
server might be now obviously having the
server closer to the user is going to be
better and then we're gonna get into
that a little bit later but this is
basically a very simple model of sort of
how it happens so when this response
comes back it's typically like a larger
response and you know this might be on
the order of like you know a few K on up
to a few megs on up to like gigs if
I'm downloading a big file but moral of
the story is that we don't really know
exactly what the scale of this is going to
be it depends really on what sort of
traffic we're dealing with and all
of these devices along the way they need
kind of more manageable chunks to work
with so this message gets divvied
up and this is the packet view of the
world we call those like a bunch of
different packets so here we got packet
one two three four and however many
packets are needed so
you can kind of think of this this is
more like if I imagine this as kind of a
envelope model of the Internet and each
one of the messages I can't put you know
for one stamp I can't put too much into
one envelope but if I want to send a
bunch I can I can send a bunch of
different ones or if it's something
bigger then it might travel slower so
that that's not really the way the
internet works so we're we're not gonna
worry about that type of scenario
we're gonna assume that every message
because it's typically just bits ones
and zeros we can we can go and break it
up it's subdivided into these little
chunks and these little packet chunks
they can travel around and sometimes
they get lost so like maybe a bunch of
these come back fine until we get to
packet four which happens to you know
just get sidetracked and
you know it goes and has a beer and
that's unfortunate because that
means that packet four is where are you
where are you packet four I don't know
where packet four went so we want to
worry about that as well that's gonna
that's this is typically referred to as
loss and it's it's a little bit
manufactured because we don't actually
know if packet four is just working its
way around a non optimal path and it's
just taking too long or if it really
just went out for happy hour so that
that part we're never really going to
know but we just sort of like
intuitively refer to this whole idea as
loss so we're gonna we're gonna try in
that case we're gonna retransmit and
we're gonna say hey I lost four and then
you know this is gonna go here's four
duplicate which is really gonna be like
packet twenty or however far ahead we
got so that's in a nutshell that's how
reliability works in TCP I don't want to
go too much into the actual protocol
specifics because the important part for
us today is going to be these servers
and you know they're well they're
routers but they're really generally
low-end computers some of the ones
towards the center of the internet tend
to be really really fast and really
really high performance but the stuff
that's out at the edges tends to just be
like a really basic computer that says
okay I've got something here and I need
to send it out over here alright great
that's in a nutshell you know how the
internet works so when we start studying
it we can kind of say like what path did
all of this go and the way that's
typically done is by a starvation
technique so the way the way we do that
and this for those of you familiar with
traceroute this is this is the gist of
kind of how it works
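As a rough sketch of that gist: each probe is only allowed to cross a limited number of hops, and whichever router sees the limit run out sends back an error, which reveals who it is. On the real internet the limit is the IP TTL field and the error is an ICMP Time Exceeded message; the simulation below fakes both with a hard-coded path whose names are invented for illustration.

```python
# Simulated route; these names are illustrative only.
PATH = ["home-router", "isp-edge", "transit", "server-edge", "server"]

def probe(ttl):
    """Send one probe that may cross at most `ttl` hops.

    Returns (responder, reached_destination). Every hop decrements the
    TTL; the hop that sees it reach zero before the destination replies
    with the "too bad" error, which tells us who that hop is.
    """
    for hop in PATH:
        ttl -= 1
        if hop == PATH[-1]:
            return hop, True      # made it all the way to the destination
        if ttl == 0:
            return hop, False     # starved here, this router answers
    raise AssertionError("unreachable: PATH always ends at the destination")

def trace(max_hops=30):
    """Raise the hop limit one step at a time until the destination answers."""
    discovered = []
    for ttl in range(1, max_hops + 1):
        responder, done = probe(ttl)
        discovered.append(responder)
        if done:
            break
    return discovered

for number, router in enumerate(trace(), start=1):
    print(number, router)
```

Each line of output corresponds to one probe with a slightly larger hop budget, which is why a trace takes longer the further away the destination is.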
it basically says ok I want to send to
here now there's no direct link between
the two so it's going to need to go here
here here here
and what it does along the way is it
says ok
send to this except only go through one
hop and so then it goes here and then it
basically this says that too bad because
and that's I can write better than that
so too bad comes back and says yeah it's
not one hop but when I find that answer
I know who this is now and if I send out
and I say okay let's go two hops then
this is going to come back it's going to
get to here and then this one's gonna
reply too bad and send it back and now I
know who this one is and if I keep doing
that all the way until I get to the
final destination you know here cool
then I can actually kind of smoke out
who all of these different routers are
along the way and if I go and I take a
quick guess based on you know their IP
address I can say like where are you you
know and I can I can geo locate them and
I can build a rough map depending on how
good I know where this particular IP is
located in the world and if I put that
all together into a visualization which
I'm going to show you shortly
then I could build some pretty
cool-looking maps and that's what I refer to
as the spy maps because they they look
very much like one of those spy movies
when you know a message it's like oh
they got our link they've got us in
Delhi no no no they they've caught us in
the Caymans you know and you could
always see like those lines going across
the globe like that's we're gonna try to
build something like that okay so that's
me if I know where the IP is located so
I have a big master table oh and sorry I
skipped over something how do I know who
all of these things are well I need an
address and that's the same as your
normal notion of an address like okay if
I'm if I'm living here and I'm in
Redding Connecticut and you know I'm at
this house and you know whatever like
that's that's a good place to be now
this is more like a phone number
in terms of it's like you're
number one you're number two you're
number three
except we want to sort of break these up
so that we know which these these these
blocks are sort of owned by these
routers so if I have something like you
know let's say that like this is 1.0.0 and
I'm using IPv4 here on purpose because
all these concepts are going to
translate later to IPv6 except
this is going to be a much much
bigger address space we sort of learned
that the internet was gonna be very
popular so great I can take this
particular address now the net
block usually starts with a 0 so I can
take this particular address and I could
say this is the first one in that block
so this is really 1.0.0.1 all
right and this might be 2.0.0.1 and
if I have a master table which sort of
says like alright you know what's the IP
and where is it then I can start to
build this type of spy map so I can say
1.0.0.1
is in New York City all right and 2.0.0.1
is in Chicago and as long as I have a
very complete list then it becomes not
too hard to put these onto a map so for
today we're going to be kind of
borrowing a few lists that are available
and I'll cover some of that a little bit
later we're that's that's that's the fun
part okay so how do we know which of
these blocks is in which place
well these routers they're in
collections so when I have a particular
collection and we're gonna we're gonna
say router collections are more of this
Brown so mostly because I'm thinking
about coffee right now this one this one
and this one are all covered by one set
of network people but this one the
one and this one so this might be like
you know in the in the New York example
maybe this is Verizon or you know in the
Verizon might be hooked up to say
Comcast or you know maybe maybe Amazon
which we're going to get to a little bit
later is you know another collection and
so all of these they sort of manage
themselves within this organization so
AWS says okay we handle these ones
Verizon says okay we handle these ones
and Comcast says we handle these ones
what do you call a collection of routers
this is our collective noun so you know
if you have a bunch of geese you have a
gaggle of geese if you have a bunch of
crows you have a murder of crows well
what do you call a bunch of routers an
ASN of routers and that is really just
kind of you can think of this as just
sort of a grouping of a particular set
so this might be ASN 1 it's not for
Verizon but we'll get to that a little
bit later and this might be ASN 2 and
this might be ASN 3 and they kind of
agree to exchange routing information
with each other at various points now
I'm not going to go into that protocol
but typically that's done over a protocol
called BGP and that's important for when
we want to say who owns a particular IP
block so if we say something like 1.0.0.0
and a bunch of addresses from
there which I'm just going to use the
shorthand for that and 2.0.0.0 and a bunch
from there this might be owned by ASN 1
and this one might be owned by ASN 2 and
by the way I'm using the word owned very
loosely here because they're actually
just leased for well I mean no one
really owns the block it's more like
it's registered to a particular company
but I'll just use the term owned a lot
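A small sketch of the "who is this block registered to" lookup, using Python's standard ipaddress module. The block sizes (/8) and the extra /24 carve-out are assumptions added for illustration; the whiteboard example only says "a bunch of addresses" starting at 1.0.0.0 and 2.0.0.0.

```python
import ipaddress

# Hypothetical registrations echoing the whiteboard example.
BLOCKS = {
    ipaddress.ip_network("1.0.0.0/8"): "ASN 1",
    ipaddress.ip_network("2.0.0.0/8"): "ASN 2",
    ipaddress.ip_network("2.0.1.0/24"): "ASN 3",  # assumed more specific block
}

def lookup(ip):
    """Return the ASN whose registered block contains `ip`, or None.

    When several blocks match, the most specific one (longest prefix) wins.
    """
    addr = ipaddress.ip_address(ip)
    matches = [net for net in BLOCKS if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return BLOCKS[best]

for ip in ("1.0.0.1", "2.0.0.1", "2.0.1.9", "9.9.9.9"):
    print(ip, lookup(ip))
```

A linear scan is fine for three entries; a table covering the whole internet would want a trie or a sorted structure instead.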
anyway so and they exchange routing
information and then basically using
that routing information when I go and I
and I'm here on my browser
and I want to connect to some server
somewhere that's pretty much how these
all know where the different blocks are
like well I know that I can get all the
way I need to get to this I need to get
to ASN 4 which is
where that server is so the way you can
get to four is you can go through one so
to four you can go through one to two to
three to four or I can go across so
that would be kind of going this way
otherwise I could also go from one to
three to four which would be a nice
shorter path if it's available and the
event that there's like a link issue
maybe I have like Network five you know
so if this is ASN 5 maybe I can come
let's say that there's an outage over
here then maybe I could go this way
through five and then over to four so I
could go one to five to four and I want to
try to choose the path that's shortest
so I could choose the shortest based on
this sometimes I might choose different
ones like depending on reliability or
depending on load like these are sort of
changing all day you know like that this
this might experience a really high
volume at evening with like more cell
usage this might experience really high
volume at evening with streaming service
this one experiences high volume all day
so these paths are sort of like always
changing
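The path choices just described (one to two to three to four, the shortcut one to three to four, and the detour through five during an outage) can be sketched as a shortest-path search over a small graph. This is only an illustration of "fewest networks crossed": real BGP route selection also weighs policy and business relationships, and the exact set of links below is an assumption based on the drawing as described.

```python
from collections import deque

# Links between the five example networks (undirected).
LINKS = {(1, 2), (2, 3), (3, 4), (1, 3), (1, 5), (5, 4)}

def shortest_path(links, src, dst):
    """Breadth-first search: returns a path with the fewest networks, or None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(shortest_path(LINKS, 1, 4))             # normal day
print(shortest_path(LINKS - {(3, 4)}, 1, 4))  # outage on the 3-to-4 link
```

Removing a link and searching again is the whole trick: the destination stays reachable as long as some chain of networks still connects the two ends.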
all right great that's enough about kind
of the theory of that so what what we're
gonna do here is like I I really want to
take a look at how this stuff is really
built I want to get away from these sort
of theoretical numbers so I'm gonna
remove the ASMR and glove that kid CAD
mentioned and hello miss Sagittarius how
are you and let's go let's go to our
practical stuff and and start
actually getting into this alright so
let's take hello how's everybody doing
um I'm here I've got my oh and that's my
window wants to be a little bit bigger
all right so we have our CSG flix and we
have our globe and lurking is totally
fine don't you lurk away miss
Sagittarius we have our globe here and it
seems to be running locally and by the
way there's a few conventions that I
sort of skipped over obviously there's
no place like 127.0.0.1 which is home so
that's just always a way of referring to
yourself but and and and routers kind of
know that they say oh you know what
that's that's not something I should
send out on the internet so this
particular app happens to be running on
my computer but we're gonna we're gonna
get to that in a moment because that's
that's something that we're gonna be
able to to see so how do we find our way
from where I am right now using that
trick of sort of like starving the route
and saying like okay only go through one
hop only go through two hops only go
through three hops and get the error
message back and say all right what what
hops were there in the middle well
there's some very nice tools that do
that the most common like the
biggest widest oldest ranged one is
called traceroute and I can actually
traceroute just about anything I could
say all right what's the route to you
know let's say I'm going to Google and I
don't even have traceroute on here
because I don't like to use it because I
prefer Matt's traceroute which is a lot
more fun and also works a little bit
quicker so this is using that
starvation technique so it's showing hop
one here this is OpenWrt this is my
router downstairs then it's going out
and they're using a 10 address on
Cablevision Optimum and then that hooks
over to something called Lightpath and
then we go this is still on the Optimum
Online network and then we're bouncing
through these a little bit and we don't
really know where they are but we do
know and this by the way this is showing
the times across all of these different
things and we can see some of these
numbers are sort of impossible because
if this is the time it takes
to get to each one of these hosts and
I'm just gonna stop this because we
don't need to keep updating like that
which is a nice feature of MTR some of
these are impossible because this is
telling me like from here where I am you
know which would be hop zero I guess to
the other end is 14 milliseconds on
average right the last one was 12
milliseconds but sometimes these ones in
the middle they take a longer time how
does this take 15 milliseconds to the
seventh hop when it gets all the way to
the other end in 12 milliseconds I mean
so these numbers are a little bit funny
and that partially has to do with the
way that this traffic gets massaged on
the internet but what you will see is
sometimes you'll see really big jumps so
in our case I want to trace route to a
server we're gonna be using a little bit
later that's over at AWS now it's not
answering to pings so it's not going to
come back at the other end but you'll
notice that there's a few big jumps and
those pretty much correspond to where
we're going across a lot of physical
distance this this sort of hits you know
and network engineers will refer to this
it's sort of something related to the
speed of light because at some point you
can't travel this this particular server
happens to be on the west coast so you
can't travel from the east coast to the
west coast in an instant I mean it's
well at least we haven't figured out how
to do it yet Star Trek has so but in the
middle you might get a little bit of
noise so don't don't believe these
numbers too specifically but they're
sort of generally right except when you
see this type of jump and now this is a
host that's not telling us something
because we didn't get a reply from it
but we did get a reply from the one
after it so MTR is smart and it knows
okay there's a host here it's just not
answering us this this is the other end
and obviously between here and here we
cross the country I mean we can tell
that just on the speed of sort of the
transit so this happens most of
these routers tend to use airport codes
so that's not exactly right it's SJC for San
Jose but that router is almost
definitely located in San Jose based on
its name and that's you know managing
these large networks these large
ASNs of routers is a bit of work and
you'll occasionally lose them so a lot
of them have some sort of geographic
hint like this one also SJO San Jose
that's fun because they both correspond
to the right sorts of times too so I can
see that this these times pretty much
map up with you know where it's probably
going across and this NYK is probably a
reference to New York I'm gonna guess
this is probably White Plains but again
that's a little bit of guesswork
involved there now this one is called
router eight so I don't know why we're
using eight not seven but that's that's
up to that particular autonomous system
and this would be on the
Cablevision network so these are all
definitely on the same network this one
might be a transit network and that's
how we're getting across the country so
that might be sort of like that ASN I
showed in the middle and these are
almost definitely in San Jose and the
times pretty much match up with that
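To put a number on the speed-of-light point, here is a back-of-the-envelope sketch of the smallest round trip that physics allows between New York and San Jose. The coordinates are approximate, and the assumption that light in fiber travels at roughly two thirds of its vacuum speed is a common rule of thumb, not a measured value for any particular cable.

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points on a sphere of Earth's mean radius."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

C_KM_PER_MS = 299_792.458 / 1000   # speed of light in vacuum, km per millisecond
FIBER_FACTOR = 2 / 3               # assumed slowdown inside optical fiber

def min_rtt_ms(distance_km):
    """Lower bound on round-trip time: there and back at fiber speed."""
    return 2 * distance_km / (C_KM_PER_MS * FIBER_FACTOR)

new_york = (40.71, -74.01)   # approximate latitude, longitude
san_jose = (37.34, -121.89)

distance = great_circle_km(*new_york, *san_jose)
print(f"distance: {distance:.0f} km, best possible RTT: {min_rtt_ms(distance):.1f} ms")
```

Any hop whose measured time jumps by tens of milliseconds over the previous one is a good candidate for the link that crosses the country; real paths are longer than the great circle, so observed times sit above this floor.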
well wouldn't it be fun if we could kind
of see where that is so I'll show you that
in a second but before I do I do want to
show you that this map is actually
interactive and it is something that you
who came today get to play with I was
working hard for you for this for this
episode so unreleased to anyone else
other than the people here I would like
to share with you this link so don't
everybody hit it at once and if you're
concerned about your IP address possibly
being revealed it might show up in a
blank but generally speaking this is a
server I set up for today and it's
important and I think that link didn't
get it right but it's really important
that we hit it on port 30000 so that
:30000 had better be in the link I did
not I intentionally did not want this to
be easily found by people just sort of
scanning around on the internet so you
might have to manually copy that link
instead of just clicking it so I'm sorry
about that but
I know someone in chat is died oh look
at that somebody in over in the UK just
hit this link and look it turned my
hands were not there so that's sort of
the fun thing I I did this in three and
this globe for those of you who checked
in with the map session last Friday that
was the 2d version this is way more fun
in 3d so hello puppies I'm guessing pie
peas is probably the one who actually
hit this link based on where I think
that person is but yeah hitting this
link will actually cause this thing to
spin just in case you know and I added
some country codes in here too it tries
to target sort of a notion of where the
center of that landmass is because I
don't have any data that's more specific
than that this is just very very coarse
Geographic data it's not the kind of
this is not marketing data it's not like
the kind of stuff that some large
advertising g-type companies might have
oh look at that somebody it looks like
over in Denmark just hit it it's kind of
a fun visualization so if sunny was here
I'm sure she would hit it and we would
see something over here you like that
nice animation ease you know I'm gonna
take some coffee just appreciate that
for a moment so this is really kind of
fun you're you're not directly hitting
my computer here you're hitting a web
server that I set up on Linode and
Linode is I basically have a WebSocket
hooked up to here so that that's and he
has miss Sagittarius I'm very glad that
I have some coffee I again went on theme
for the mug today so that's it's old
logo it's old school but cheers to
everyone so that'll be there you can
play with it now how do we kind of see
like what that path is so we have this
sort of trace and it would be really
nice if we could do things like put this
Norwalk Connecticut which is not too far
away from me on the map and then from
there connect to whatever this is
is if we know and then from there draw a
line over to White Plains New York and
then from there still stay in New York
maybe go to New York City if we don't
know exactly where we're not worried
about super precision we don't need to
know exactly where the router is we just
kind of want to get a general sense an
intuition about how it travels and and
then of course it'll sort of traverse
the country and go over to San Jose
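A minimal sketch of turning a trace like that into something a map can draw, assuming each hop has already been labeled with a city by hand. The coordinates are approximate, the lookup table is hypothetical, and hops with no known location are simply skipped. GeoJSON is used here only because it is a common format for map lines; nothing in the episode says the globe uses it.

```python
import json

# Approximate (longitude, latitude) pairs; GeoJSON puts longitude first.
CITY_COORDS = {
    "Norwalk, CT": (-73.41, 41.12),
    "White Plains, NY": (-73.76, 41.03),
    "New York, NY": (-74.01, 40.71),
    "San Jose, CA": (-121.89, 37.34),
}

def path_to_geojson(hop_cities):
    """Build a GeoJSON LineString through every hop whose city is known."""
    coords = [list(CITY_COORDS[city]) for city in hop_cities if city in CITY_COORDS]
    return {
        "type": "Feature",
        "properties": {"located_hops": len(coords)},
        "geometry": {"type": "LineString", "coordinates": coords},
    }

# None stands in for a hop that answered but could not be placed.
hops = ["Norwalk, CT", None, "White Plains, NY", "New York, NY", "San Jose, CA"]
feature = path_to_geojson(hops)
print(json.dumps(feature, indent=2))
```

Drawing straight segments between known points is crude, but it is enough to build intuition about which way the traffic went.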
well I went and did that for you so the
first thing you're going to need to know
in order to do that and the way that
this information works is it basically
takes an IP address like this one and I
built a small little IP to ASN lookup so
this is it looking up Google's common
8.8.8.8 DNS address that a lot of people
use and we can see here and this might
be a little bit small so let's blow it up this
is the number in normal IPv4
dotted-quad notation but it
actually is just a 32-bit number so this
is actually it as a number this is the
autonomous system and that's just a
collection of routers so this is in
autonomous system 15169
and 15169 is registered
to Google and Google filled out on their
registration that they were in the US
and that's actually the information I'm
using to target where this happens to go
so and if I did that this would be like
you know it's the this this server is
almost definitely based on ping time
that we did earlier this is it's almost
definitely in New York but it just
because I'm I don't have that precise
geo-targeting so I'm just sort of
centering it on on that and this is the
particular block that I just told you
about so this is the you know 8.8.8 and
then this /24 tells you how many
addresses are in the block so that basically
says that this last number
whatever number is there it's in
this block and so this particular one is
here so if we did that with something
that showed up along the way like this
unlabeled 65.19.121.122
oops let's do that then this is
actually the default rule so anytime it
doesn't know where it is
it sort of defaults to this George
Washington University and that's that's
kind of like the default I don't know
where it is I need to give it a place
and you'll see that on things like for
example if I looked at the home address
127.0.0.1 that's gonna
it's not really at George Washington
University but they have the default
prefix set up right now so that's that's
more like a an artifact of the data and
hello Don Ho how are you so and that'll
be true for other things like ones some
of you might be familiar with 192.168
these types of addresses they're RFC
1918 they're all about not being
assigned so they're never going to be
assigned to any particular provider so
they'll all just sort of default to this
George Washington University address and
that's true for the 10 addresses you know
that again it's they're just artifacts
so like this one for example is an
unassigned address so that's not
publicly routable on the Internet
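The address facts from this part of the demo (an IPv4 address is one 32-bit number, a /24 is a block of neighboring addresses, and loopback and RFC 1918 addresses are never publicly routed) can be checked with Python's standard ipaddress module. The classify helper is a made-up name for this sketch, and Python's is_private flag covers a few reserved ranges beyond RFC 1918.

```python
import ipaddress

addr = ipaddress.ip_address("8.8.8.8")
as_number = int(addr)                       # the same address as one 32-bit integer
block = ipaddress.ip_network("8.8.8.0/24")  # /24: first 24 bits fixed, last 8 free

print(as_number, block.num_addresses, addr in block)

def classify(ip):
    """Very coarse label: loopback, private, or public."""
    a = ipaddress.ip_address(ip)
    if a.is_loopback:
        return "loopback"
    if a.is_private:
        return "private"
    return "public"

for ip in ("127.0.0.1", "192.168.1.1", "10.0.0.1", "8.8.8.8"):
    print(ip, classify(ip))
```

A lookup table built from public registrations has nothing to say about the first three addresses, which is why they fall through to whatever default entry the data happens to contain.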
so okay yeah doing good just just
getting warmed up we're getting in the
network session Don Ho so this is so
basically this this table and if you're
interested in this I put how to do it on
my blog so if you've got a blog of some
guy come and I don't have a lot of stuff
here but this is kind of a bit about
what we're talking about today and this
this this article will walk you through
how to actually build the table that I'm
using so I'm using exactly this IPAs
nmap in this pieces offer so this is our
coarse rough geolocation I figure if we
don't know where a particular IP address
is then we could at least target it to
the country based on what's registered
for the ASN that
and that that'll be very coarse
precision but you know it's kind of fun
like if I did like 1.2.3.4
that's not assigned here let's do let's
do a Level 3 address so 4.4.4 to
four is from Level 3 so at least I
can look on this see that it's ASN
3356 and that's in the US probably
although Level 3 is a global
provider okay um although it's owned by
someone I think they changed that a few
years ago anyway great well that's fun
this I'll leave this up for anyone who
wants to play with it including Don Ho
who just got here so great if we use
that information and we get a little bit
more precise you know by you know maybe
reading some of these and saying oh this
is really in White Plains or this is
really Norwalk Connecticut and if we
actually had if we built up a longer
term data base and used a higher quality
data set we can actually start building
maps of all of these paths and that's
what I did so this is your CIA view
that I that I mentioned you're gonna get
some spy stuff today well here's your
spy maps so some of these are what I did
was I took the people who were hitting
this website and it's mostly probably
scanners but this was from the coding
with actually no this is from
codingwithsomeguy so there's actually a
bunch of legitimate hits here so this is
just people that go and hit
codingwithsomeguy.com I took some of that
traffic and pumped it into some higher
quality data which got down to more of
the the city level and built these map
visualizations of how it was traveling
through the internet to get to all of
them and some of them are really crazy
so you can start off here this one is
looks like it's in Taiwan and before it
goes straight from you know this is
they're all going to start from Linode
so this is all from the point of view of
the Linode server that's hosting that
website Linode's not directly connected
to Taiwan at least by the path that this
chose so instead it probably went out to
Chicago bounced down here bounced
up here and then went far this so some
sort of cable now this is a very
unsophisticated mapping technique I just
drew lines between the locations but in
reality this almost definitely went over
the Pacific instead of probably the way
that it looks like it's going here but
so there's imprecision with this but I
thought it was kind of a fun rough
visualization of kind of how much we do
know that there all of these hops exist
where their particular locations are
it's you know it depends on the data
quality of how you have your IP location
stuff so sometimes you know Linode
happen to take a detour here through the
UK before it got to this particular
server I don't know why that happens
some of these get really fun this is you
know this is your it took this these
particular sets of packets went across
the country and then they sort of came
back and then bounced around and then
bounced around some more it eventually
got to a server over here so we don't
really know exactly which way it's gonna
go it depends on how those those paths
go that I described earlier but it works
remarkably well which is kind of amazing
and sometimes like I mean it just gets
really into a twist this I have a
feeling is just probably some bad
geolocation data about you know where
the particular routers are stored but
yeah like this one I'm pretty sure is an
artifact I don't think it really
traversed the Atlantic that many times
but some of these are quite fun like
here's some of the ones over in Africa
that that came by and they bounced
pretty good to it this was probably a
client in South Africa that was
accessing the website so anyway I built
this I thought it was kind of a a lot
more fun to look at something like this
even if it's less precise than it is to
look at something like this you know so
for this particular route these take a
little bit of time to collect so I'm not
gonna show them live I mean basically
you have to do this traceroute and by the
way I used mtr for these so you have to
do this traceroute and then for each
one of these addresses you have to go
and look them up like basically say
what's the geolocation of this IP what's
the geolocation of this IP and I'm
showing it to you in name form but if I
put it into the form that I was
using then this is kind of more of what
it looks like you know so I can get
these IPs and look them up one at a time
and I'll end up building a map if I
throw it all onto the map all right
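The mapping step described here (take the hop IPs from a trace, look up a location for each one, then draw straight lines between them) could look roughly like the sketch below. The `locate` callback stands in for whatever IP geolocation data set is used and is an assumption, as are the function and field names. Distances use the standard haversine great-circle formula, so like the maps they ignore the real cable routes.

```javascript
const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two { lat, lon } points given in degrees.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// hops: array of IP strings in trace order.
// locate: hypothetical lookup, returns { lat, lon } or null when the IP is unknown.
function pathSegments(hops, locate) {
  const located = [];
  for (const ip of hops) {
    const where = locate(ip);
    if (where) located.push({ ip, lat: where.lat, lon: where.lon }); // skip unknown hops
  }
  const segments = [];
  for (let i = 1; i < located.length; i++) {
    const from = located[i - 1];
    const to = located[i];
    segments.push({ from: from.ip, to: to.ip, km: haversineKm(from, to) });
  }
  return segments;
}
```

Each segment is one line on the map. A hop with bad location data shows up as a very long segment, which is one way to spot the artifacts.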
well that's kind of neat so you get the
2d and you get the 3d you know so I
think the stretch goal may be in a
future stream for fun is we might put
those on the globe oh it looks like
somebody in America hid the globe while
I was looking at what hey all right
great so that's kind of how the Internet
travels it's all these routers connected
to each other it's it's constantly
communicating paths across ASNs
collections of routers and the path that
it takes I don't really know I mean like
if I'm going to like amazon.com for
example and that's kind of the wonder of
the Internet is that I don't really need
to be concerned too much with how all of
this works because it just works most of
the time except when it doesn't now this
one again they usually use airport codes
so it looks like this one's located in
Newark that sort of matches with the
speed of light time calculation I know
that this isn't on the West Coast based
on you know how short the time is okay
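The speed of light check mentioned here can be written down as a small calculation. This is a sketch under one stated assumption: light in fiber travels at roughly two thirds of its speed in a vacuum, about 200 km per millisecond. A round trip time then puts an upper bound on how far away the responding router can be. The function names are made up for illustration.

```javascript
// Assumption: signal speed in fiber is about 200,000 km/s, i.e. 200 km per millisecond.
const FIBER_KM_PER_MS = 200;

// Farthest the other end could possibly be, given a measured round trip time.
// Real paths are shorter than this bound because of routing detours and queueing.
function maxDistanceKm(rttMs) {
  return (rttMs / 2) * FIBER_KM_PER_MS;
}

// True if a claimed location is physically possible for the measured round trip.
function isPlausible(rttMs, claimedDistanceKm) {
  return claimedDistanceKm <= maxDistanceKm(rttMs);
}
```

For example, a 10 millisecond round trip bounds the hop to within 1000 km, so from the New York area that rules out the West Coast, which is roughly 4000 km away.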
so well we can look at traces for a long
time we can get kind of a sense of you
know all that but what we really want to
do is we're building a streaming
application we want to build a streaming
application that works on top of all
this and for you it might be just a
website it might be for kid cab and a
visual ASMR it might be actually yeah he
has some kind of fun he was doing some
CSS yesterday so I Oh CSS you can see my
idea of styling I'm like ah just
throwing it a tag call it
alright the border won't be quite right
I just it's so much work to get that
stuff right and Tyler if you're around
you're you're amazing at what you do so
all right
what we're typically going to get in an
application is we're probably going to
be pretty far from
our particular server and that'll be far
depending on where the user is like so
for example if this user in Brazil is
trying to get to the server it this is a
pretty far path in terms of traversing
networks now even even with the
imprecision even if this is relatively
straight line just the number of hops
that this has to go through before it
actually hits this particular server
which actually I know is located over
here this is gonna be a little bit rough
and that's typical of your application
servers so wherever you're doing your
application logic like you know in CSG
flix I had the player sorry the the the
UI in the previous episode where we were
using a controller and we were sort of
out running we were doing our search
traffic and the search queries every
time I would run a search query by
typing another letter it would have to
travel all this path and then get all
the way back and then when the user
finally selected a video it would still
have to do all of this and this would
get really bad really quickly so
basically the answer was people came up
with CDNs and they said why don't we
create a bunch of servers that don't
really have much logic on them so you
don't put your code there so much as you
put just your static files so in the
case of our video we might put our video
files on the CDN much closer to where
our user is and put our application
logic on a server which is further away
and that's part of the problem with
using a lot of the existing bandwidth
tools they assume that those distances
are the same but in reality CDNs are
typically very close to the user and
application servers are typically far
from the user so that's what we have to
kind of engineer around and that's a lot
easier if you can see it and it's really
amazing every time I've seen with a
developer they're like well I'll just go
to Brazil and program there if you can
swing that that's that's a valid
solution you will experience the network
exactly as it is if you go everywhere in
the world and I think that's super cool
but assuming you don't do that or if
that's cost prohibitive then we might
want to be able to teleport and simulate
it and have some level of precision
around this difference between CDN and
AWS we want to know that whatever we
simulate it more or less matches reality
and that I'm going to show you shortly
because I got something that's gonna
help you with that but so kind of where
we left that was I I had the and this
was this is pretty much
I had a few check-ins that I thought
were not very interesting for the stream
but they these check-ins were really
about just just kind of cleaning some
stuff up with the with with the code and
and separating I wanted to get ready for
CDN use so in the previous episodes
I had just assumed that the CDN server
and the app logic server were the same
so I just teased them apart and I checked
all that in for you on github so if you
go to GitHub codingwithsomeguy in
the event that you want to follow along
in a little more detail later on
probably then you can find this on the
latest version of CSG flix and here's
kind of the cleanups that I mostly got
in yesterday for for today and and I
added a controller last last time we had
the controller and we were doing stop
and I was like I need to add some controls
for that but this was really about
separating the CDN from the app logic
server so I'm not gonna go too much over
those changes they're available here if
you want to see that you can check it
out over there so great this is the code
that I have it's this is basically up to
date so if I ran this thing and this is
kind of where I got it started now I'm
running something on this particular
port right now which is this geo
simulation but but
I think I add it yeah I did I added a
command-line argument for that which is
what we're gonna use right now so let's
put it on another port let's let's go on
port 8000 all right so this thing on
port 8000 this is sort of where we left
it now this was with AWS and the CDN on
the same server which is unrealistic
once we go out to the wild and that's
that's I'm gonna show you that so here
was sort of our LRUD navigation and
when we were doing our coffee look how
nice and fast and instant it's so
beautiful when you do this stuff in the
lab you know because it just it just
immediately loads like there's no
problem you see like I type a search
query it immediately comes back and if I
go into this you know and by the way
minor change here I added this is the
normal video player scaling with some
scaling which we'll get to in a little
bit but this is the the actual size this
is the one that you saw in the previous
and this is the Delta so this is showing
the difference between each frame in my
video so but that's all you can watch
the earlier episodes if you want to know
about that so if I go and I plan I added
a focus so alright so I can see this
happening these two are pretty much
gonna be identical or they should be
identical and this is the Delta this is
showing the pixels that are changing
from frame to frame so there's me
drinking the coffee from one of the
earlier episodes and this actually this
is the run-length encoding this is the
compression episode okay and we added a
stop button so we can actually stop the
thing great well everything works fine
ship the product it's gonna work exactly
like that when we hook it up to the real
Internet who thinks that's gonna be the
case should should I ask for a poll does
anyone think the real Internet's gonna
look anything like what I just showed
you after showing you all of these
different maps of how it's gonna
traverse the world like no way I'm gonna
tell you right now
way the real internet doesn't look
anything like this and that's
unfortunate because when I programmed it
I was programming it here and so I don't
see a lot of the issues that my users
are gonna experience the minute that I
set this thing up so I can run around
and try to do this like kind of the
old-fashioned way and like plug in on my
sister's network and plug in and you know
my friends network and hook it up to my
cell phone and see how it works there
but really I can only test a very small
sample of the world kind of going that
way and that's that's that's where we
run into trouble it'd be really nice if
once we deploy this application let's
say to a server we call AWS and I don't
have ICMP on for it but I can actually
see what this is like on the real
Internet now in this particular server
setup I have separated the CDN which I'm
going to use Linode as my CDN from AWS
which is where my application logic is
so I think this is a pretty typical
setup you might be using Cloudflare you
if you're at Netflix you're definitely
using Open Connect which is fantastic so
your CDN it I'm not gonna get into like
judging the quality of CDNs or how many
servers they have or how much you want
to spend on CDN stuff but the CDN itself
is pretty much just static assets so in
this case I've loaded the video files
and these images onto the CDN and I have
not not when I'm in dev mode like I am
on this particular server but I added
this config flag up top which I can show
you in this isn't a player in main so I
added this config flag right here
which is where to find your CDN so this
takes a URL now right now it's I had it
set up it defaults to the local case
because I'm assuming you want to run it
on your own machine but if you were
deploying it you would change this URL
to be you know what your real CDN URL is
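As a rough illustration of the idea (one setting that says where static assets live, defaulting to the local machine), here is a small client-side sketch. It is not the actual flag from the repository: `DEFAULT_CDN_BASE`, its value, and the `cdnUrl` helper are hypothetical names chosen for this example. It uses the standard `URL` constructor to join the base and the asset path, and normalizes slashes so a stray leading or missing slash does not produce a broken path.

```javascript
// Hypothetical default for local development; a deployment would point this at the real CDN.
const DEFAULT_CDN_BASE = "http://localhost:8000/static/cdn/";

// Build the full URL for a static asset (video file, box art) under the CDN base.
function cdnUrl(assetPath, cdnBase = DEFAULT_CDN_BASE) {
  // Make sure the base ends with exactly one slash so the last path segment is kept.
  const base = cdnBase.endsWith("/") ? cdnBase : cdnBase + "/";
  // Drop leading slashes so the asset path is resolved relative to the base.
  const relative = assetPath.replace(/^\/+/, "");
  return new URL(relative, base).toString();
}
```

The application logic keeps talking to its own server, and only asset URLs go through `cdnUrl`, so switching from the lab setup to a deployed CDN is a one-line configuration change.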
and we're gonna use and it looks like I
have a path issue with that too which
still loads fortunately so in our case
we're gonna move that over to a server
which we
call CDN so that way I can say
alright let's let's go look at the CDN
server and if we were just loading that
page nothing is here in the default page
I just put empty alright and this and
it's important that I keep adding that
HTTP colon slash slash otherwise it'll
turn it into a search never fun so I
guess you're supposed to use CDN dot
local but anyway I didn't so this this
particular it has exactly the same file
structure so if I look in the static
this is the result of the encoding so
when you run the encoding example that I
included then it creates this directory
CDN including I just tarred this this
directory up and copied it straight to
the CDN so if I go and I look in here
I'm gonna see this box art and movie
directory I'll just pull up a piece of
box art so take like this movie so if we
do this and we do box art and we do this
particular movie all right well not the
most engaging box art that I've ever
made
yeah see that's what happens when you
don't put the HTTP colon slash or a
charm
so if we do this okay so here's a
slightly more interesting box art but
anyway so this is these assets basically
I just took this directory and copied it
to the CDN so that's what I'm going to
use with AWS and I copied the
application to AWS so if I go to the
server called AWS handy that I did that
and I put it on port 1990 then here's my
same application that I was running
locally so in this case I'm gonna this
is this is the application running
locally but now we're actually we've
deployed it right it's running on a
server that's somewhere although now I
know how to know where it is I can say
show me where the AWS server is and
it'll tell me it's gonna go through this
whole path to San Jose as opposed to the
path to the CDN server which is going to
be a little bit closer and you'll notice
from the times that it is actually much
closer the other one was on the order of
80 milliseconds so even though it's
fewer hops it's further away physically
and that leads to more transit delays
but this one a few more hops from my
particular network to Linode but only
33 milliseconds okay ish so all right
great does this work yes well it turns
out if I search see it's not not quite
as fast anymore now it's there's a
little bit of load we can start to see
as we go into our application and if I
come in here and I hit play oh what oh
what is it doing I wonder what it's
doing that's not how it worked in the
lab I didn't have to wait that long what
was that like five seconds like I mean
come on you know what's what what just
happened
welcome to the real internet now this
is the real Internet this is actually a
really fast internet connection and
that's why I built this so what we're
gonna do for this is I built a virtual
machine for today and this virtual
machine is
let's let's get rid of this browser here
this virtual machine is hiding right
here and this device that this is just
stock latest Ubuntu 20 whatever and I
installed chromium on it and now I can
access my website except this particular
device is routed through this other
device and that's what I'm going to show
you today this this is a device and it's
responding too fast because that's a
problem with the serial port this is an
operating system which is called
manaOS and this is what is running this is
something I worked on it's open source
and it's available now but I put it on
these hardware boxes and this is this is
a one of the ones that I worked on again
low end computer it's got a couple
network interfaces and it's a special
router because unlike a normal router
this particular router can change what
your internet looks like and it can it
can make it look like anywhere in the
world so let's say that I cleared all
this out and I wanted to go somewhere in
the world like so we're gonna start off
easily now I told you before that we
learned a little bit about ASNs and we
saw the example which I'll just show you
real quick of Level 3 so this is ASN
3356
so I can come over here and take a look
at ASN 3356 and it'll say oh yeah
that's Level 3 no funny I used the same
table so we're gonna add that network as
one of our favorites now when I go and I
hit this icon here and sorry about the UI
I didn't have a lot of time and kid cam
knows about my CSS skills but when you
go and you hit this you're gonna see
this little map over here and this is
let me just blow this up a little bit
this is running on the Magic Modem and
this is what network traffic on this ASN
looks like throughout the day
so this is at hour zero all times are in
GMT so this is at hour zero this is you
know so then it hits peak and then it
kind of comes back down so what it did
is it actually changed my internet to
reflect that internet so when I just set
this thing to a particular ASM by the
way go ahead tell me any SN you want
because I want to see what it looks like
and you can just tell me like and this
this doesn't look up IPs we could look
that up with the other tool I need to
add that but we can actually just look
up by IP we can look up by country show
me all of the ASNs that are
registered in the US there's a lot
there's Level 3 yes everyone a
little bit out of date and some of this
data so we're gonna use something that I
used a lot while I was testing it this
is this is Comcast 7922 so this is kind
of Comcast typical broadband looks a
little bit different from the Level 3
Network but then this is an end internet
provider and then what as soon as I
click that it changes my simulation for
what these network characteristics look
like and let's take that let's try that
out so we're on Comcast now and I go and
I load and I remember my HTTP I go and I
load my AWS server ok I didn't it's not
there it's on port 1990 ok this thing
loaded pretty fast and that's not
surprising considering the fact that
we're testing this thing on Comcast
which is similar to the network like the
Cablevision network that I'm on you know
which is let's see I'm not sure exactly
which one it is but I could I guess we
could see
yeah so let's just take this one and
figure out what is cinema one oh of
course it's data issue this is on the
transit network this is actually where
it is but this one might be registered
so this is why I really need to add that
IP lookup to the the other tool but even
if this isn't the exact network yeah of
course the data is not there for this
even if it's not this exact network I
know that any of these networks are very
similar to the network that I'm on
because they all have pretty much the
same sorts of characteristics okay this
is Telia so we could use this this is
1299 this is a transit network again so
we could we could shape to that and I'm
gonna see this is this includes a bunch
of data for this particular network well
ok that actually changed my connection
so when I reload this thing it pretty
much looks the same as it did you know
when I loaded it on AWS in my normal
browser so this that wasn't that bad
except now let's let's have a little bit
more fun let's see what this looks like
if we were in Brazil like we were
earlier so let's put let's pick a
provider in Brazil I don't see is there
no I was not registered under cloud
let's let's let's take this one
Telecom Limited Namita I guess I don't
know I don't speak Portuguese so we'll
try these the ones with the lower ASN
numbers they tend to have been there for
a while so yeah let's see here's kind of
ok so this this one bounces quite a bit
again Network conditions change on time
of day so but my ping times are quite a
bit higher my packet loss is a bit
higher and this by the way this is the
the ping to this is the AWS Akamai and
CDN now in this case I've got
Linode loaded as the CDN but typically at
Netflix this is Open Connect and so in
this case like it's very Open Connect
it's very very close to this particular
server so the CDN time will be very fast
but it'll it will change the settings
for the wide area only go to AWS so I
reload this well it's not much on that
page so it still loads pretty fast until
I come and start searching and yeah
that's not instantaneous anymore okay so
let's let's get a little bit further
let's let's I like to go to
Uzbekistan so we're gonna let's try this
Uzbek Telecom now the times are
significantly different for this
particular high speed and there that
time we actually saw the page load and
it was a little bit different and now
when I'm doing my search yeah that's
even though that probably hit the
browser cache we should we should really
disable that now I think about it so
let's let's come in here let's let's go
into our network and let's make sure
that we're yeah okay cache is disabled
so let's reload and yeah that's I can
see that this experience looks different
than what we just tested and when I go
and I hit play now this yeah I can see
it's starting to load the resource here
what's happening man the real internet
different this is not what it was like
at all when I was building my
application now again this is a little
bit of an extreme example I mean we
don't need to go as crazy as this
provider in Uzbekistan but you can see
that this this thing does not load
instantaneously and now we're hitting
some packet loss so the connections
actually stalled some of those routers
in the middle they weren't happy and
this is a even though that this resource
is loading off the CDN and it knows that
the shapes are different yeah
that took a while I mean that that did
not load immediately that was 41 seconds
to get the asset that was a blocking
download this is this is bad right this
is what are we gonna do with this we're
software engineers how do we deal with
this I mean we can't just change the
network I mean we can't we can't bring
everybody fiber tomorrow I mean Oh
that'd be nice
well chat any suggestions what do you
what do you think we should do while
you're thinking about that I'm definitely
taking the opportunity to drink some
coffee we want this application to work
globally this is not working great
globally I mean this these are users
where we're not gonna be able to get I
mean let's let's take a more common like
connected example I mean Hong Kong very
affluent market pie Pease knows all
about Hong Kong and let's see what which
what do we want HKT China Mobile in Hong
Kong Chinese University Hutchison we've
got everything is here so HKBN is an
all fiber provider so nicholai
interesting dude so you'll see the times
are really really good you know it's
very low loss on this network on this
ASN of routers there's you know time to
AWS is not super short but time this one
has the CDN on its network so it's super
good and we're seeing about 37 megabits
so whenever somebody starts telling you
about network shaping if there's one
thing you're gonna take away today
realize that just changing the bandwidth
is not a very good simulation if you
went out there in the field if you set
it to like you know 5 megabits or
something but you left all the loss and
you don't get into the asymmetry like
the real world internet doesn't look
anything like that so this is and again
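One way to see why a bandwidth cap alone is a poor simulation is a back-of-the-envelope model. The sketch below uses the well known Mathis approximation for steady-state TCP throughput, which says throughput is at most about (MSS / RTT) × (C / √p), with C ≈ √(3/2) and p the packet loss rate. This is a swapped-in textbook approximation for illustration, not how the Magic Modem shapes traffic, and the function names are made up.

```javascript
// Upper bound on a single TCP flow's throughput from latency and loss alone (Mathis approximation).
function lossLimitedMbps(rttMs, lossRate, mssBytes = 1460) {
  if (lossRate <= 0) return Infinity; // no loss: this model imposes no limit
  const C = Math.sqrt(3 / 2);
  const bytesPerSecond = (mssBytes / (rttMs / 1000)) * (C / Math.sqrt(lossRate));
  return (bytesPerSecond * 8) / 1e6;
}

// What a flow can roughly expect: the smaller of the link rate and the loss-limited rate.
function effectiveMbps(linkMbps, rttMs, lossRate) {
  return Math.min(linkMbps, lossLimitedMbps(rttMs, lossRate));
}
```

Under this model a 5 megabit link with a 100 millisecond round trip and 1% loss delivers well under 2 megabits to a single flow, while the same 5 megabit link at 20 milliseconds and 0.01% loss delivers the full 5. Both would look identical to a tool that only throttles bandwidth.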
if I reload this player all right this
is gonna look a lot actually it's I'm
not doing anything until play starts so
until I tap play yeah that was pretty
good right that
that loaded four seconds I mean that's
HKBN if you happen to be lucky enough to
be on fiber let's let's pick something
in Denmark whoa whoa I don't know how to
say this right okay so this is a pretty
well engineered network we're between 12 and 24
milliseconds pretty much all day 47
megabits okay this is a monster of a
network actually so maybe why we was not
it's considering Pluralsight offer oh
and how about a fee how are you
Pluralsight is that oh no they're not
registered under that name
let's do telefónica let's let's do
something a little more all right
Telefonica in Spain yeah these times are
not so good so in this particular case
Open Connect is very close Akamai not so
close probably not peered on this
network
AWS pretty good and relatively low loss
there's a few problems in this data pack
but all right three and a half megabits
this is probably a reasonable connection
for Western Europe so let's let's try
that out hit play and it's be really
nice if I could record these sorts of
times like if I press this button and
then I know how long it takes before it
starts playing I can call that tap to
play let's say that would be a useful
metric for my user and I keep doing it
sort of by hand we're doing it manually
by going around the world but let's
actually add that to our application
let's let's start collecting real data
from our users you know as soon
as we get this application later you'll
notice with some of the packet loss this
this stalls occasionally and that's you
know that's the real world this is what
the Internet's really doing you know so
we can already see that this application
as we built it is not performing that
great it's performing it's not
performing that great fo fee
doing well today
here well as well just going through
some routing we can travel this is the
practical teleportation we're here we
can use this interface which is really
just hooked up to this I'm running it in
a VM so you can do it you can do it on
hardware you can do it in a VM and it's
changing the shape based on all of these
different providers now this is kind of
great you know just sort of it works it
works fine in software I built a
hardware device mainly because it's hard
to test this with like a mobile phone or
with you know if you're worried about
like mobile performance or if you want
to hook it up to a TV or a set-top box
or any of the stuff that Netflix had to
work on then that's kind of why I built
the hardware version of it but as you
can see here in the virtual machine you
can run it in software but what it does
is all of these shapes and all of this
data well I'll get to that a little bit
later but I'm gonna collect it for
myself like say I wanted to build my own
right here and let's um yeah let's let's
do that let's add it to the code so in
this particular case I'm running I'm
gonna work on my local one having issues
with my internet yeah this is all about
the Internet do you know any really good
monitoring applications to see real-time
up and down speeds I do let me get into
that a little bit later the if you check
back earlier the traceroute data is
actually the best stuff that you can
possibly get but anyway and also it's
easy to check your speed which you know
this this shaping right now it's changed
my speed to match the particular
provider I chose so in this case I was
targeting around 3 megabits and you can
see fast.com kind of agrees with me so
we can this will this will eventually
kind of settle on around that speed but
if I change that back to say HKBN or
Wow which actually had a ton of I don't
even know if I have that much bandwidth
here you can't simulate anything that's
faster than what you've got
so when I change that immediately
you'll notice fast.com changed so this
is this this changes instantaneously in
that app like like packet to packet
so this yeah I can I we can switch back
over to here and then suddenly fast.com is
gonna kind of fall apart it's gonna get
upset it's like something impossible
just happened how did your internet become
that bad that quickly and kind of tough
off ease point you can actually see that
in the time of day where it shows you
like this is how it changes even
throughout the day like you know this
particular provider is fairly flat in
the sense that it's like from 20
millisecond well
sorry this was not sure it wasn't
updated this is the Uzbek Telecom
provider got a little graph bug there
which which varies between ninety and a
hundred thirty five milliseconds so this
actually bounces quite a bit packet loss
changing through the day but when I was
on HKBN this is all between 60 and
70 milliseconds so it's this is a really
well engineered provider and
comparatively speaking you could
actually look we could look at yours in
here if you want coffee - so just let me
know this all right so let's tune this
thing we're gonna tune this thing let's
build the metric first so I'll build it
on yeah you know what actually I think
it'd be better if we built let's just
build it directly on the AWS version
because this if I do it locally then
we're not gonna well actually you know
what I take that back we could now that
we know that this this simulation kind
of works even if I'm going to AWS I'm
going to it yeah let's stay on AWS
because I think it's more fun on
AWS so we're gonna connect ourselves
to AWS oops
and
all right here it is let me have this
open in another window so let me just
detach that and here we go okay so it's
not loading the video file this is just
the player UI with the tag so I saw that
over here so this is it passing along
like what what it's tags gonna be but
you know when I load this you know I can
see yeah this immediately loads that's
my UI but then I go and I make my hit to
the CDN and it doesn't actually load
anything off here all of that's coming
from the CDN and you do get that
asymmetric like real world simulation
with this so HKBN again very nice
Network so you can see me moving the
mouse here the the cursor in the Delta
frame managed to catch that one nicely
so let's add some metrics all right so
we go in here and we edit our main
because we're live changing stuff on the
server yeah because we're gonna roll
that way okay let's I think yeah so this
is gonna change our yeah let's let's
start with the JavaScript first so I'll
load up the template and we're gonna
work in the player because we don't
we're not directly concerned with what
how many like what the ping time is how
many milliseconds there are like what
the bandwidth is those are network
characteristics how many hops we're
going through what we really want to
know for our application is when the
user starts trying to play how long does
it take to play and that's an
application metric not a network metric
so that's an important distinction
because we want to tune the application
metrics we're not you know the tools
that are here to test all of the
different network stuff they're amazing
but that's we're not tuning the network
we're tuning an application so we're
gonna have to try to work around how
this really works on the Internet
okay so great let's um I'll just keep
this up here so why don't we come down
here to our play
and of course my highlighting doesn't
work when I jump all the way down
because I should not I should really
separate the JavaScript from the rest of
the file so I left myself a few to-dos
but this this play kind of captures what
we're doing we'll get to a few different
code changes that I did but basically
what we want to know is down here at the
bottom is my actual play button action
and when it does that it comes up here
to fetch and play or if I do it via key
I added the key handler so that we could
actually use the DualShock controller we
used last time and when we do that
originally when we wrote the episode we
had hard-coded this but now we were
taking we're taking the movie directly
from this this takes the CDN so if right
now it's hooked up to the real CDN so
it's running on Linode no the real fake
CDN because I'm not using Linode's CDN
and this this URL it's it's it's gonna
try to load it there this is still not
the user intent when it's fetching this
URL actually no I take that back from
the moment we're in this routine the
user has pressed play so if we want to,
you know, measure like kind of what I was
looking at in the console before where
we press play we initiate the playback
and we want to see how long it takes
before it starts playing that's really
this kind of time right here which is
that fetch URL and I said we were gonna
get more specific than just fetch URL
and that's kind of what we're gonna do
so I'm gonna throw this in the global
space I'm gonna call this the start time
and let's let's set start time here to
be you know we're gonna use JavaScript
timer for this just you know we're it
we're not gonna worry so much at the
millisecond level as the like second
level I mean like I'm gonna see gross
effects here so I'm gonna take this
start time variable I'm gonna bring it
up to here so that we don't
have to forward reference it because
this play function is the next thing
that gets called right because if I if I
look down here it basically fetches this
URL parses the JSON which should be
pretty fast in native library and then
it calls play so in play I actually
know that the response
came back so I could record like how
much is network time versus decode time
if we want to get that specific but let's
talk about tap to play as a metric so
I'm talking about tapping that button to
the first frame of video shows up so
that's this first image and this is
actually what shows it so I'm gonna take
my timing as right after this so that
would be my playback time so look let's
say that the end time is right here so
let's let's go and I'll just leave that
local this will be Date.now and let me
just double check that I've got that
right, I'm a little rusty on some of this so if
I do Date.now yeah okay so I'm gonna get
a number like this which is, this
is an epoch time shifted over to add in
milliseconds here so I can basically
take these two and take the difference
of them and that's sort of my my my
metric that's my my tap to play so if I
just output that I can say tap to play
was really end time minus start time
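The start-time / end-time pattern described here is written in browser JavaScript on the stream. As a rough sketch of the same idea in Python (for example in a scripted test client), something like the following could work. The class name `TapToPlayTimer` and the injectable `clock_ms` argument are illustrative assumptions, and it uses a monotonic clock rather than the wall-clock `Date.now` used in the episode.

```python
import time


class TapToPlayTimer:
    """Measure elapsed milliseconds from a user action to the first frame.

    Hypothetical helper: names are illustrative, not from the stream's code.
    """

    def __init__(self, clock_ms=None):
        # clock_ms can be swapped out in tests; the default is a monotonic
        # clock scaled to milliseconds.
        self._clock_ms = clock_ms or (lambda: time.monotonic() * 1000.0)
        self._start_ms = None

    def start(self):
        # Call this when the user presses play.
        self._start_ms = self._clock_ms()

    def stop(self):
        # Call this right after the first frame is drawn.
        if self._start_ms is None:
            raise RuntimeError("stop() called before start()")
        return self._clock_ms() - self._start_ms
```

The value returned by `stop()` is the tap-to-play metric in milliseconds, the same end time minus start time difference.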
okay so because we're on this server it
doesn't auto reload which is
unfortunate so get that oh yeah because
I'm editing the JavaScript all
right so I reload that I hit play
hopefully I left that on one of the
fast networks and as soon as that
finishes I get a tap to play so that was
you know and again this is in
milliseconds I believe so that yeah 4.3
seconds if I stop this and just take a
look in the network tab yeah that's
that's about right so I can see that
there's a little bit extra past the
network because it has to do some
parsing and then it has to draw it on
the screen so that is my time to my
first frame roughly I mean it's at the
end of the function call let me know if
that just doesn't make sense but
basically that's our tap to play that's
going to be our key metric for tuning
the video okay so let's say our tap to
play is equal to what I just did
end time minus, okay so all I really need to
do is send that up to the server and I'm
gonna do that so down here I'm gonna
actually I'll just send me this is after
the video is done I don't want to kind
of get well it's not because this is all
gonna be a callback so this JavaScript
will just run straight through to here
but one of the things I'm gonna want to
be careful I don't want to interfere
with a request from the browser right so
if I'm sending metrics, try to find a
good time in your life cycle to emit
your metrics so let's say that yeah
let's let's emit, let's emit metric,
tap to play and what if we just
calculate, we calculate it, has to be okay
so of course that doesn't want to work
so if I have my emit, what I call the emit
metric, let's say a metric name and value
so all we really need to do is well
first I'm gonna adjust this so this is
going to be metric name and all right so
if I run this real quick then I should
be okay yeah okay no no no error so far
and yeah it comes back and this is going
to be what I want to get up to the
server so to do that let's yeah let's
just let's run fetch again and let's
let's make a new endpoint so right now
we have a play endpoint so let's let's
make a metric endpoint
now in a real example you would probably
store this in some sort of data tier
you'd want to do this very quickly on
your own your metric server but in our
case I'm just gonna I'm going to emit it
as a log line because we only want to
see like from our user we want to be
able to collect telemetry from the user
to say this was the experience that they
had for a metric we care about so this
is something we care to optimize so
we're just gonna the response will be
you know just I'll just say okay and
what we need to do in here is I'm gonna
pass this along as a query parameter so
let's say this is metric and then well
here let's let's build a little object
so let's say that this is going to be
metric name and right now I've only got
one metric but you've got to start
somewhere right this is our first piece
of telemetry on how our application is
performing in the field don't guess
measure it, it's just not that hard, you
saw how hard it was to measure, it's just
the end time minus the start time, that's usually
it there's there's all sorts of advanced
metrics you can get into most of it you
could do just exactly the way I just
described and it's amazing how few
places do just that you know there are
they go and they outsource this it's
like just just collect it, it takes
two seconds okay so this will be our
bundle and in the bundle we're basically
just gonna, let's say stringified, and
I'm just gonna JSON.stringify
this bundle
actually I shouldn't do that you know
what I'm gonna URI
encode it, let's let's do
encodeURIComponent cuz then I could
just pass this along as a GET parameter
I don't want to get into like CORS
because I had some CORS fun getting
this far so normally you would POST
this but for right now just to simplify
it I'm gonna I'm gonna just send it as a
query parameter
okay so stringified is going to be the
encodeURIComponent of this bundle so
that'll just be a flat string that's
safe to pass along so I can say
that the query parameter, we'll say that
the data is this stringified and then
when it's done sending it will just emit
you know, data sent, metrics, emit
metric big good very very good text okay
so this will return it and this is
assuming that it's the same server and I
don't want to get into CORS stuff but this
this will basically just send a request
back with that data which really only
has the tap to play in the bundle so
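The client side of this (JSON-encode the bundle, then percent-encode it so it is safe as a GET query parameter) is done in JavaScript on the stream. A minimal Python equivalent, for instance for a scripted test client, might look like the sketch below. The `/metric` path, the `metricName`/`value` keys, and the function name are assumptions for illustration; only the idea of a single query parameter carrying the encoded bundle comes from the episode.

```python
import json
from urllib.parse import quote


def build_metric_url(base_url, metric_name, value):
    """Build a GET URL carrying one metric as a percent-encoded JSON bundle.

    Hypothetical helper: the endpoint path and key names are assumptions.
    """
    bundle = {"metricName": metric_name, "value": value}
    # json.dumps plays the role of JSON.stringify; quote(..., safe="")
    # percent-encodes every reserved character, much like encodeURIComponent.
    encoded = quote(json.dumps(bundle), safe="")
    return f"{base_url}/metric?d={encoded}"
```

Sending the bundle as a query parameter keeps the request a plain GET, which is the simplification the episode makes to avoid a CORS preflight; a production client would more likely POST it.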
let's let's see what comes back so we're
gonna say if request, and this by the way
this is just a Flask app, so if I use
request.args and I look for, what do
we what do we call this, d, so let's check
if d
in request.args okay so if there is
data then we're gonna emit it now you
would probably store this in like Redis
or database whatever you want to really
persist this but for now we're just
gonna we're gonna use the log as our
persistence so metrics data looks like
what you want to say this is request
args d
okay so that should be it
so if I go here I won't run it through
the simulation you'll get the full speed
internet let's go to the one that's
remote so it's I have it on ninety
ninety okay and let's go in here I'll
just pick the first one and play then I
should have gotten back yeah I get back
a metrics that looks like I've got an
issue there so let's just take a quick
look at what showed up oh I'm sorry I
was showing you the other window let me
let me show you the the local browser
window before we go into the simulated
internet version of it and yes favi
CORS is just, I thought about meming it
with the Khan scream, CORS, that was that
was fun getting that CDN to work let me
tell you anyway so it looks like
JavaScript sending along something I
don't want it to send so let's let's
just take a quick look here with this
and
well let's let's just console.log
stringified so this should just be a
regular string you know what I think
yeah if I'm gonna send a full object like
I'm doing then I'll need to do both
probably the bug so right yeah see there
you go here's to getting lucky so yeah
that I have to unwrap it right so Flask
will automatically un-encode it so it's
it's URI encoded I
should say and so it'll automatically
unwrap that but this is still just a
string in Flask so if I wanted to get
that as an actual object I would need to
say json.loads request.args okay
and this is gonna be my new metrics
object and of course that could be
unsafe because i'm doing i don't know
that that's valid data so let's let's
hold that for a second let's just let's
let's write the negative oh you know
what it throws a ValueError
it's a JSONDecodeError but it's a type
of ValueError so I can I can do this
and say except ValueError, just ignore it
because it's just bad metrics data so
we're not worried about that case and
this will come across and now my metrics
data should be metrics so okay and we'll
let it stringify that in Python's
sort of way all right so I do that hit
play yeah now i can see that the the
quoting changed so this is actually now
an object there you go alright so now
we're collecting a real user metric now
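The server-side handling walked through above (look for the `d` parameter, un-encode it, parse the JSON, and ignore bad data) can be sketched without Flask using only the standard library. The function name is hypothetical; in the Flask app on the stream the already-decoded string comes from `request.args`, whereas this sketch parses a raw query string itself.

```python
import json
from urllib.parse import parse_qs


def parse_metric_query(query_string):
    """Return the decoded metrics object, or None if it is missing or invalid.

    Hypothetical stand-in for the handler body shown on the stream.
    """
    # parse_qs percent-decodes the values, similar to what Flask does
    # before exposing them through request.args.
    params = parse_qs(query_string)
    if "d" not in params:
        return None
    try:
        # json.JSONDecodeError is a subclass of ValueError, so catching
        # ValueError covers malformed JSON.
        return json.loads(params["d"][0])
    except ValueError:
        # Bad metrics data is simply dropped.
        return None
```

A caller would log the returned object (or persist it somewhere more durable) only when the result is not `None`.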
let's see what that looks like this was
nice because i only had to wait 61
milliseconds when i ran
my unshaped internet because the
download probably hit the cache and
everything was really fast and wonderful
except we already learned the real
internet doesn't work that way so let's
go back to let's start off in Hong Kong
with HKBN Knicks people and let's
reload so that we get the version that
has the metrics in it alright great it
loaded that's cool so I go and I hit
play I hit play I get to wait ok cool I
go and I hit play and then yeah that's
that was 8.3 seconds so we got unlucky
with the packet loss that time and it
stalled the connection so in this case
yeah it was 8.3 seconds so now I'm
collecting the real metrics that let me
see, we can get telemetry, once we
know what it's doing we can start
optimizing it cuz now we're not guessing
anymore right I can try this in
different parts of the world I can go
over to wahoo and I can run that again
and this time my metrics blob let me let
me just reload the app so that we know
we're fresh so let me try that again
we can watch it download because we're
actually on the client at this point but
when we deploy our production
application we actually get back some
telemetry now that we know that we can
optimize we can say seven point one
seconds that's too long for our users to
wait it's not a good streaming
experience we want it to happen fast we
need it to happen fast oh I've had
coffee now
oh and sorry father so are you using
Flask for the API framework and Python
yes I am using Flask, we're using the
development server right now, I have it
set up with uWSGI behind nginx, that
that's for kind of bigger apps I'm not
really too concerned you could write
this in Node it wouldn't change it too
much
I like Python mainly just cuz it's less
lines and I think it translates better
to people that may or may not be
familiar with various server tech, look
I don't want to get into like a Java app
we'll spend like the whole episode just
setting up our classes and whatnot to get
anywhere near JSON parsing in Java oh
that's fun okay we have a metric we can
optimize it so what are we gonna do well
there's a there's a whole world and
science of optimization but the first key
to any optimization is find out what's
really happening before you start
optimizing now we could take all of this
metrics data collect it along with the
IP for where it's coming from and start
building a map if we collect all of that
data and start shaping it by ASN, as a
collection of routers, then
what we can do, because we can assume
that networks are relatively consistent
across an AS, and that's not a hundred
percent true there's there's a few
global multinational ASNs that are very
much inconsistent across it depends very
much where you are but a lot of the
smaller ISPs especially the ones that
are much more regionally contained they
tend to have fairly consistent network
engineering you can watch my whole
Monitorama talk about that if you're
really interested but that's actually
where this shaping data comes from I
added this shaping data to the Netflix
application so Smart TVs and let's say
Smart TVs smartphones there's a web
version of it and I didn't finish
deploying that sorry Bogdan those are
feeding this data we have iOS data
that's in here so we're seeing mobile
networks we're seeing broadband networks
we're seeing Wi-Fi we're seeing wired
we're seeing all the different
variations and when we take all of that
data do a lot of modeling then we
produce this data pack which gives you a
pretty good idea of what things are like
that's a continual process we need to
kind of keep getting better at that but
right now that's pretty much like this
is kind of like how it began I basically
collect a ton of this application metric
stuff there's really more on the network
side you know on how long did different
network flows take so that's like if I
do each of these fetches how long do
those fetches take so that I can build a
pretty good Network model of what the
internet looks like well great I'm going
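The idea of grouping collected metrics by ASN, mentioned earlier in this passage, could be sketched roughly as follows. This assumes each sample has already been tagged with the ASN of the client IP (that lookup is not shown), and it uses the median as the summary, which is a choice made for this sketch and not something stated in the episode. The ASNs in any usage would be whatever your own lookup produces.

```python
from collections import defaultdict
from statistics import median


def median_tap_to_play_by_asn(samples):
    """Summarise tap-to-play per network.

    samples: iterable of (asn, tap_to_play_ms) pairs. The function name and
    the sample shape are illustrative assumptions.
    """
    grouped = defaultdict(list)
    for asn, tap_to_play_ms in samples:
        grouped[asn].append(tap_to_play_ms)
    # One summary number per ASN; the median is less sensitive than the
    # mean to a single stalled connection.
    return {asn: median(values) for asn, values in grouped.items()}
```

With a table like this, a slow start can be attributed to a particular network instead of being averaged away across all users.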
to tell you right now I think the best
thing to do so
a lot of ways to optimize this we're
gonna optimize it the way a lot of
streaming services like to optimize it
which is I told you in an earlier
episode that we have a not this was I
was quoting Anthony Park that there's
sort of an Iron Triangle to streaming
and those variables are really important
there it's a constant trade that we
always get to make so in this case we're
keeping full quality but we're making
the user wait that's not good now we're
not getting into the rebuffer
situation because I haven't done like
multiple segments on maybe I'll do some
code for that before another stream but
that's where we would just play one
after the other because if it stops in
the middle that's a rebuffer right
that's bad then you start showing the
spinner while we're loading the next
chunk that's that's really bad Netflix
tunes that way out so for right now
we're really only going to worry about
quality and speed and we saw speed and
we don't like it waiting eight seconds
for a video to start is not good so we
we want that to not happen and so the
first thing we can do I mean we can
start pre loading the inning there's all
the different programming tricks that
you can use to optimize you can read
about performance stuff but the simplest
one to do is we have to send all of this
data to start playing this is 18 megs
of data so we can improve our encoding
we can we can compress further that's
there's a lot of techniques available
that if you you watch the compression
episode so that's one area you could do
we can't do that very quickly, very
quickly we can halve the resolution and
scale it up so this is 240p and I
mentioned we would do this in an earlier
episode I actually checked this in for
you so if you look at the encoding tool
the encoding tool got an edit which now
basically says scale so before it was
always running it at whatever encode
you gave it so if you gave it a 240 line
you know if there are 240 progressive
lines in your video then it encoded 240
progressive lines well now it accepts a
scale parameter and that's it was
relatively easy to implement
but I think the details aren't that
interesting basically I just resized all
the image so if you take all the images
that are 240 and you keep the same
aspect ratio and you halve them you get
120 and now that I've encoded that for
you, which we could, I could re-encode it
if you want to see it, I don't think
that's really that much fun but yeah let
me know I'll just do it real quick so
that you can see what it looks like so I
included this in the previous stream so
we basically I'm just gonna use the same
virtual environment this is Python and
I'm gonna run my encoding script alright
it's already encoded so it's not gonna
tell you very much if I if I come over
here and I delete my CDN now it's not
encoded then I can go back here rerun
the encoding and what it did was it
unwrapped every one of the frames in the
video so if we were doing like 30 frames
a second there's 30 of these frames
every second and since I'm using these
clips then there's 900 frames in a
particular clip that's yeah 30 seconds
so it's doing each movie one at a time
and it's encoding it at 240p and then
when it's done it's gonna re-encode all
of it at 120 p and if you do all that
and you deploy it to your CDN see this
takes a little time to encode so we
can watch it encode and while it's
encoding I can go take a look at chat
Coffee says I haven't used Flask we use
EF at work in production and nodejs
for POS is that like proof of feature
I'm guessing an early dev since we use
angular in the front-end so I'm only I'm
only used to using node and EF for API
yeah that's totally fair I mean it this
doesn't change much if you're using node
I mean obviously you gonna be writing in
JavaScript as opposed to Python but I
mean the code is not that different I
mean like if you're using something like
Express and you have routes you know you
can define exactly the same kind of
thing like like I'm doing maybe if it
really helps I mean I could do some
this in in JavaScript in a future stream
but the Python I'm really using for
compactness cuz it's a lot easier to
express an idea in Python I think than
it is to do it javascript seems to be a
lot more lines it's easier to get lost
as you're doing it so yeah I run I run
all that, I re-encode all my movies now
this by the way you can imagine when I'm
only doing 6 30-second clips now
obviously I didn't optimize the
encoder because I just did it quick for
the show and we didn't we can work on
optimizing our packing format you know
so we can stop storing in JSON maybe we
could do binary serialization there's
there's all sorts of different things we
can do to optimize it we pick a strategy
but the thing is keep the ground truth
in mind right what are we optimizing
we're optimizing start with the metric
we're optimizing this metric so if I'm
gonna compress more that means less bits
going it's probably gonna load faster I
can see how I did based on this metric
don't guess I am proof of feature comes
out I'm the only one yeah it's I don't
know people are not very talkative today
I'm talking a lot it's a lot of material
to get through I apologize for that
I realized last week when I was gonna
originally do something like wow this is
this is a lot it's a lot of stuff so
anyway great we just re encoded all of
our stuff I could take that copy it over
to the CDN I already did that for you
and include it in there
if I look in the media in this version
is these 120 P and there you go so now
here's all of the movies available in
120 P and they're also available in 240p
that was a very easy optimization
because all I had to do it was really
like one line I mean aside from some of
the machinery to be able to pass the
scale through to all these different
functions all it is is really just this
it's an image resize and I just keep the
same width times a scale factor same
height times the same scale factor
anti-alias the result send it back and
and that's how I modified these routines
so they're very very simple so this
pretty much no ops in the event that
it's saying keep the same size it just
doesn't do any of the math but in the
case of like the 120 all it did was it
took the picture and shrunk it and then
it re-encoded it exactly the same I had
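The scale step described above (same width times a scale factor, same height times the same factor, a no-op when the scale keeps the size) can be sketched in plain Python. The encoder on the stream uses an image library's resize with anti-aliasing; the library is not named here, so the 2x2 box average below is only a stand-in for that filter, and both function names are illustrative.

```python
def scaled_size(width, height, scale):
    """Return the target (width, height) for a frame, preserving aspect ratio."""
    if scale == 1:
        # Keeping the same size: skip the math entirely.
        return width, height
    return max(1, round(width * scale)), max(1, round(height * scale))


def downscale_half(pixels):
    """Halve a grayscale frame (list of rows) by averaging each 2x2 block.

    A simple box filter used as a stand-in for the anti-aliased resize;
    any odd trailing row or column is dropped.
    """
    out = []
    for y in range(0, len(pixels) - 1, 2):
        row = []
        for x in range(0, len(pixels[y]) - 1, 2):
            total = (pixels[y][x] + pixels[y][x + 1]
                     + pixels[y + 1][x] + pixels[y + 1][x + 1])
            row.append(total // 4)
        out.append(row)
    return out
```

Halving both dimensions leaves a quarter of the pixels per frame, which is where most of the download saving comes from before any further compression.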
a quick bug fix I realized where I was
off on one of the parameters in the
player but now that I have that let's
see what happens so here I have my
metrics now my last metric was from here
and it was it took us six point seven
seconds to get this 18 meg file yeah
there there goes the 18 megs now I'm
getting lucky it's not a lot of packet
loss okay there you go
so that was a little bit slower that
time again the loss is really what what
changes this but so it was about seven
seconds yeah and that's by the time I'm
drawing on the screen seven point four
seconds okay let's do it
with 120 P file look at that look at
that one point one point eight seconds
so trading quality is a big deal and if
you're wondering why a lot of streaming
services start off in lower quality and
then they move up that's why now what I
did in the app you'll notice these
pictures are smaller this is the decode
I kept exactly the same decoding logic
but I added this window and what I'm
doing here is I'm just scaling it so I
ran into this problem when we were you
know first doing it in the episode but
basically I used a hack workaround, it
basically just capture the canvas as an
image data URL and I scale it up now
that's something you would do in the
video hardware there's better ways to do
this I could actually do it in the bit
but now I have a metric I can work with
and that thing loaded quick on exactly
the same internet we went from like
seven seconds to one and a half seconds
and what does that look like when I
start checking around the world that
makes a big big difference okay so if I
come here to telefónica like I was doing
now let's let's go back to telefónica in
Spain and I just want to add that as a
favorite so so when that change okay
this particular thing you know the
here's telefónica in spain we're at much
lower bitrate so we come here we reload
the page now we're using half the
resolution hit play yeah this is you
know still slow but you can see that it
has a lot less data to transfer so it's
going to get through this and it's gonna
go great start so and then we can look
at the metrics data that's coming back
now again this is why you want to know
which ISP this came from because the
connectivity is gonna be you know
different for different parts of the
world so yeah totally agree fuck it
would be a little awkward if I wasn't
talking my stream okay so this this is
cool this works that's that's magic we
can see how this changes now basically
we can travel everywhere and we can see
what it does
for different optimizations that we can
work on so obviously it's blurrier I
don't know how that's coming through on
the stream you know what's not coming
through at all because I've got chromium
covering it up Oh
so yeah this is it basically loading
this was the window I was looking at and
when it came back and it finished this
is a little bit blurrier you can see on
this 120p version that yeah this
wasn't this wasn't as much fun it was it
was a better experience but we're still
not quite there for a product release if
we want to get those users and again the
more ISPs that we can get this to work
for, that's more that our product works
for around the world, it's more potential
customers, that's a big deal, this is
gonna come up in your career so yeah on
HKBN it started pretty fast I got my
play start down to 266 milliseconds
that was a sorry that was the wrong one
no this was let me let me let me clear
that was that was from before so we're
gonna go back to recording I actually
did get the metric up here but it was
1.7 seconds on that start so if I reload
I know I've got my cache disabled and I
hit play this is kind of the number we
care about it downloads this thing and
that's it
we got our metrics response stop this
for a second, if I change to the 240
version we saw before it was about seven
seconds so if I come here and I say this
let's do this at 240 clear our log here
clear all this we're good to go we hit
play yeah so obviously the quality is
going to be better but the user has to
wait longer it's not it's not as crisp
and quick an experience so and that's
something where we could get really
advanced over time now that we can kind
of really see what it's like for the
user this is something you could the
hardest part I think of solving any of
these internet issues is just knowing
what the internet looks like and knowing
being able to be in Hong Kong or Brazil
or anywhere around the world and
experience what that's like it's very
easy to optimize the target when you
kind of know how the target works like
as you can see I can do my coding
changes here and I can get immediate
feedback and then I can confirm it with
actual data from the field right this is
this all kind of works so that's the big
topic let me let me know if any of that
is sort of confusing so when we have all
that done you know where can we go in
the future this obviously
there's a lot that we can do here so we
can we can start working I mean a real a
real streaming service is going to
adaptively switch between these
different quality factors depending on
your internet so we could put some logic
in here but like okay the Internet's
downloading freely I don't know I don't
have any previous data just take the
lowest quality one start as quick as
possible
collect some data keep some history up
upgrade based on what the connection will
support, see how often, you know, how long
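The adaptive idea sketched in words here (with no history, start at the lowest quality, then upgrade based on what the connection supports) might look something like this. The quality ladder, the bitrate numbers in any usage, and the rule of taking the worst of the last three samples are all assumptions made for illustration, not the logic of any real streaming service.

```python
def pick_quality(history_kbps, ladder):
    """Choose a quality label from measured throughput history.

    history_kbps: list of past throughput samples, oldest first.
    ladder: list of (label, required_kbps) pairs, sorted ascending.
    Hypothetical helper; names and thresholds are illustrative.
    """
    if not history_kbps:
        # No previous data: take the lowest quality and start quickly.
        return ladder[0][0]
    # Be conservative: use the worst of the most recent samples.
    estimate = min(history_kbps[-3:])
    choice = ladder[0][0]
    for label, required_kbps in ladder:
        if required_kbps <= estimate:
            choice = label
    return choice
```

Each finished download adds a sample to the history, so the choice can move up once the connection has shown it can sustain a higher rung.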
does the page take to load I could add
another metric I could say what's the
time to interactive what's the time to
being responsive when when is it that
the user gets a chance which metric do I
want to optimize I can do better job I
can do a better job compressing yeah how
much is binary serialization worth well
it'll shave you know half the 50%
savings in terms of what I'm
transmitting over the wire that's a big
deal
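To see why binary serialization saves bytes over JSON, here is a small comparison. It assumes, purely for illustration, that a frame is stored as a flat list of 0 to 255 pixel values; the actual packing format used in the series may differ, and the exact saving depends on the data, so no specific ratio is claimed.

```python
import json


def json_size(pixels):
    """Bytes needed to send the pixel list as UTF-8 JSON text."""
    return len(json.dumps(pixels).encode("utf-8"))


def binary_size(pixels):
    """Bytes needed to send the same pixels as raw bytes, one per value.

    bytes() requires every value to be in the range 0 to 255.
    """
    return len(bytes(pixels))
```

In JSON every value costs its decimal digits plus separators, while the raw form costs exactly one byte per value, so comparing the two functions on a real frame shows the saving for that particular data.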
I can this application is gonna work a
lot better it can be a lot more
responsive I could start at a higher
quality level if I'm working on a
general application I I can get a real
sense of what how it works for my users
you know what's it like over a cell
provider you know like like what if I go
and I hook this thing up to like I don't
have t-mobile loaded here but what do
you you know let's just pick a cell
provider yeah I mean we could again
you'd want to see what particular IP
you're getting sighs I don't want to
take a look at like you know like AT&T
or something but I I can actually test
it exactly as it is cut out in the field
and that's that's what's sort of amazing
about this immediate feedback it makes
optimization because there's all these I
mean as a programmer you're always kind
of guessing you're like what what I
don't know I think this is better this
is less bit so it should be better
sometimes the data doesn't agree
sometimes you don't have any data this
so that's where you really want to
collect this let me tell you that's a
much better argument when you're at work
and you're like look I can prove that
this one is significantly better than
the other and that's one of the things I
really liked when I was
at Netflix is that a lot of it is
data-driven so the decisions usually are
like okay show me a metric that tangibly
improves that builds a better experience
I can break this into smaller chunks
right now I'm using thirty-second chunks
I could use two second chunks and have a
lot more room to switch up and down
quality as you saw I can rescale it
that's kind of how I get the the
difference and you know smudginess in
the video or I could play around with
like the search you know when I was back
here on the application this well
actually it was quite a ways a little
bit ago I could see how fast it is from
when the user hits a key to when it
renders that would be a really useful
metric if I was trying to if I know that
this is the primary thing a user's
doing in my application is searching for
some content I could see like how fast
do the keys respond you know I mean
again that's it's really fast here
because there's not much in this app yet
but that's that's that's sort of like
you know this is real-world software
engineering so anyway there's a lot you
can do hey ethics how are you doing I
haven't seen you in a few streams Coffee
says we do static images with dynamic
overlays at work if it takes 15 seconds
for the overlay to appear on the image
custom text or image on a shirt our
customers have a bad UX and they will
send complaints, yeah I mean this is a
really easy way to see what's really
happening for your customers yeah
there's a lot of different tricks like
for example I have in here I've got the
text of the search result and I have the
the frame of the search result I might
run an a/b test which there's an a/b
episode coming up where I want to know
like is it better to show just the text
even if the image hasn't loaded or just
wait until the image loads because the
image is much richer to the user
experience so that matters more how do
you make that kind of trade well the
easy way to do it is measure the metrics
that matter you know I mean that that
could be the hard part but you know it's
not too hard to add these metrics so
add them in, start looking and then test the
stuff you know like you can run an
actual a/b test and say like okay this
improves the load time but users tend to
abandon more you know I mean if you're
doing like a sign-up funnel or if you're
doing a you know a campaign like a you
know conversions for marketing like say
you're selling an online product I don't
know what kind of product you're working
on fluffy but I mean yeah you you really
want to have these metrics built around
some particular goal like is it how much
people stream is it how much they look
through the catalog, I mean what are you
trying to optimize then figure out what
you can measure then figure out like you
know how you can actually see what it's
like from their shoes and that's that's
what this is this will help you a lot
with that I will tell you this I mean
there's like a world of engineering
questions I will tell you this this is
available now I worked I worked on this
for a while and if you wanted to use it
I don't suggest you download it right
now there's there's some it's it's still
pretty alpha I mean that's you're seeing
it work here but I know that there's
some data pack problems and the most
recent data set so we're great we're
gonna need to do a little bit more work
there but this is available at manaOS
org so this is I I called it manaOS
because it powers a device that I
lovingly named the magic modem after
Torrio's suggestion which is really a
router but it has a modem built in some
fun there but this you don't need to use
the hardware to use this the VirtualBox
version is available a little bit rough
give me give me till next week but just
in case this is the link you can check
out this beautiful website one of my
colleagues built so this is yeah and you
can read about it and use it if you want
like I said it does work in VirtualBox
or you could put it on actual hardware I
don't support that much hardware yet but
the data is that that is interesting and
has the whole story about how when I was
sitting in this is in Kazakhstan testing
Netflix which was sort of a fun
the time the way I got this to work
ultimately kind of came down to using a
lot of these techniques and it's really
about knowing which bits you can use
which you can sacrifice where does the
quality matter where's the quality hurt
where does it you know where is it
better to start fast and that's that's
kind of the real world of trades in
engineering so that's kind of it
that's episode five I hope you enjoyed
it I hope you all have a good time with
your other engineering activities it's
very hard it's hard field let's some
yeah let's let's find somebody to raid
so if you want to check it out so next
week I'm gonna be going back this is the
let's say this is CSG flicks, this was
CSG flicks yeah this was the kind of the
real Internet this is where most
companies tend to go wrong with their
products so here's a tool that will help
you with it and by the way I'm gonna
show the right screen for that this is
this is cool I'm not sure we're gonna do
personalization I might I might do more
on the the application side maybe we'll
do chunks I mean we could do either
adaptive streaming or personalization
and we'll say let me see what I can kind
of pull together for for next week's
show I think here for episode 7
definitely gonna do a/b tests so if
you're interested in that and those are
kind of a lot and then you know we'll be
around to chat and talk about our new
streaming product and how it works all
around the world because we have this
wonderful globe which I think I'm gonna
add to the magic modem now that it it
can do these things and you know we can
we can travel to Brazil or we can travel
to Argentina or we can travel to Iceland
so I think it'll be fun that might be a
fun little visualization to throw into
the magic modem so
I'll get to that some point in the
future