{"id":168,"date":"2021-11-19T10:33:46","date_gmt":"2021-11-19T10:33:46","guid":{"rendered":"https:\/\/wqrld.net\/blog\/?p=168"},"modified":"2022-04-20T21:53:50","modified_gmt":"2022-04-20T21:53:50","slug":"statistics-for-engineers","status":"publish","type":"post","link":"https:\/\/wqrld.net\/blog\/statistics-for-engineers\/","title":{"rendered":"Statistics basic: stddev and z-score"},"content":{"rendered":"\n<p>I&#8217;ve been trying to wrap my head around some statistics\/data science used for dissecting DDoS attacks, and came across a couple of new topics that are quite important but rarely explained.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Sources<\/h3>\n\n\n\n<figure class=\"wp-block-embed is-type-wp-embed is-provider-statistiekbegeleider wp-block-embed-statistiekbegeleider\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"aWvm8uOgF2\"><a href=\"https:\/\/statistiekbegeleider.nl\/z-scores\/\">Z-Scores aan het uitvogelen?<\/a><\/blockquote><iframe loading=\"lazy\" class=\"wp-embedded-content\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; clip: rect(1px, 1px, 1px, 1px);\" title=\"&#8220;Z-Scores aan het uitvogelen?&#8221; &#8212; Statistiekbegeleider\" src=\"https:\/\/statistiekbegeleider.nl\/z-scores\/embed\/#?secret=aWvm8uOgF2\" data-secret=\"aWvm8uOgF2\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<p><a href=\"https:\/\/www.wiskunde.net\/standaarddeviatie\">https:\/\/www.wiskunde.net\/standaarddeviatie<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Standard deviation<\/h2>\n\n\n\n<p>Standard deviation is a property of a set that describes the spread around the mean.<\/p>\n\n\n\n<p>S<sub>x<\/sub>&nbsp;= \u03c3 = the standard deviation of the set<br>x<sub>i<\/sub>&nbsp;= the i-th element of the set<br>x<sub>gem<\/sub>&nbsp;= the mean of the set<br>n<sub>x<\/sub>&nbsp;= the total number of 
elements in the set<br><br>\u03c3 = S<sub>x<\/sub>&nbsp;= \u221a( \u2211 (x<sub>i<\/sub>&nbsp;&#8211; x<sub>gem<\/sub>)<sup>2<\/sup>&nbsp;\/ n<sub>x<\/sub> )<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/www.wiskunde.net\/img\/standaarddeviatie.jpg\" alt=\"Standaarddeviatie\"\/><figcaption>SRC: wiskunde.net<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Z-score<\/h2>\n\n\n\n<p><\/p>\n\n\n\n<p>The z-score is a simple, normalized way of seeing whether a value lies above or below the mean, and whether it is an outlier (a z-score above 3 or below -3 is often treated as an outlier).<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"http:\/\/upload.wikimedia.org\/math\/0\/6\/8\/0680b4b5027a142c544710088121eca5.png\" alt=\"Z = \\frac{X - \\mu}{\\sigma}\"\/><figcaption>SRC: statistiekbegeleider<\/figcaption><\/figure>\n\n\n\n<p>mean = average<br>Z-score = (measurement &#8211; mean) \/ stddev<\/p>\n\n\n\n<p>In Python (pandas):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># NaN results (e.g. when every count is equal, so stddev is 0) are filled with 0; that fill value is an assumption\ndf&#91;'zscore'] = ((df&#91;'count'] - df&#91;'count'].mean()) \/ df&#91;'count'].std(ddof=0)).round().fillna(0)<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Extra: Binomial coefficient (Newton binomial)<\/h2>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/420bf080448b0b64ddd2eaeaa6a9c2cb8fd6923b\" alt=\"{\\displaystyle {n \\choose k}={\\frac {n!}{k!(n-k)!}}}\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>If we take n = 10 and k = 3 (also called 10 choose 3), the outcome is 10! \/ (3! \u00b7 7!) = 120.<\/p>\n\n\n\n<p>The binomial coefficient gives the number of ways to choose k (three) elements out of n (10) when the order does not matter. 
Take for example the number of topping combinations on a pizza when you must choose exactly 3 toppings from a total pool of 10 options.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;ve been trying to wrap my head around some statistics\/data science used for dissecting DDoS attacks, and came across a couple of new topics that are quite important but rarely explained. Sources https:\/\/www.wiskunde.net\/standaarddeviatie Standard deviation Standard deviation is a property of a set that describes the spread around the mean. Sx&nbsp;= \u03c3 = the standard &hellip; <a href=\"https:\/\/wqrld.net\/blog\/statistics-for-engineers\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Statistics basic: stddev and z-score&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-168","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/posts\/168","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/comments?post=168"}],"version-history":[{"count":6,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/posts\/168\/revisions"}],"predecessor-version":[{"id":250,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/posts\/168\/revisions\/250"}],"wp:attachment":[{"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/media?parent=168"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/
wp\/v2\/categories?post=168"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wqrld.net\/blog\/wp-json\/wp\/v2\/tags?post=168"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}