Analyzing websites

The difference between a good webdev and a great one is that the great one can not only build sites, but also reason about ways to get more sales.
I will mainly be focusing on hosting and IT websites because that is my area of expertise, but you will see these things everywhere. I plan on writing another article on business models, writing, and maybe talks. But I’ll see how far I can get.

Example: DigitalOcean

DigitalOcean is one of those sites that every web developer should have looked at.
I’ll point out a few things that they do right, in no particular order:

  • Colors and design stick out. The blue sticks out, and together with the modern design makes for a pleasant page to look at.
  • The subtitle instantly shows some examples of their most common customers, and is kept very generic. People who want to host an application that fits one of these categories are immediately assured that they’ve come to the right place.
  • One-click sign-on with the two platforms that are most commonly used (and of course a normal email signup). The barrier to entry is very low.
  • Sales button on the bottom right for questions, and bigger businesses that may need more convincing but bring in more value.
  • The numbers at the bottom of the page show that this is no small company and can probably be trusted.

Some other things done well on the DO website from a business perspective:

  • Case studies. While I would argue that these are not very useful for most people, they absolutely help in convincing other people that a similar business is already using the product and thus it should be good.
  • Free credits. The cloud hosting market is competitive, but with limited-time free credits people can start deploying their app without any risk, and will likely stay if it just works.
  • DigitalOcean has a very large knowledge base of non-DO-related help articles for almost anything related to system administration. This helps their marketing through increased brand awareness and gives an enormous boost to their SEO.

Website: https://www.digitalocean.com


Salesforce

Salesforce is a B2B CRM company. While I do not enjoy their design so much, they got a lot right in regards to selling.

  • 2 big buzzwords and their product type, right as the first thing you see. This site is obviously more targeted towards business decision makers and will immediately get them interested.
  • They offer a free trial, and a demo video so you can get in without any excuses.
  • The header also contains an “industries” button where they lay out how different industries use their product. This gives them a sense of having experience and shows examples of how your company can be improved by their product.
  • Also: a plain old phone number. Pick up the phone and you’re right there talking to their sales team. They also offer a live chat, but it’s a bit more hidden for some reason.

Site: https://www.salesforce.com/


Tesla

While the Salesforce site was relatively plain, the Tesla website again sticks out. You are instantly dropped into a video of a Tesla driving through harsh conditions and a man enjoying the speed, probably recognizable for their target audience.

They put a lot of attention into getting you to book a demo drive, I presume because their cars are mostly sold through their website, so the usual test drive at a dealer is less of a thing.

One thing that sticks out is the big $7,500 tax credit banner. People *love* free money, and this makes it feel like both a time-bound offer and a very good deal, even though Teslas are quite expensive.

The product page is also something different. You instantly see the car in a nice view, the most important stats and the option to choose a predetermined model or custom order (on a consumer car?! what?!)

Some of the most common concerns are also instantly addressed on the product page. An image shows the safety structure of the car (Americans…), the thousands of charging stations and a visualization of the range.

The Tesla site is surely a source of inspiration: https://www.tesla.com


GoDaddy

Alright, now try it for yourself. What design and business choices can you pick out from this website?

https://www.godaddy.com

Others

There are many other good examples of well-executed websites, but showing more would mean a lot of repetition of the previous examples. Do you have a site that you really like or that has some novel ideas? Let me know and I might add it to the article.

Have a good day!
– Luc

An extremely complicated intro to cryptography

Some notes and explanations for crypto. The title is a joke reference to crypto guide titles often containing phrases like “made Simple” or “Gentle introduction”.

prereqs:
modulo function (remainder in division)
XOR function (2 bits are the same: 0, 2 bits are different: 1)

Terminology
Ciphertext: Encrypted text, should not give any information about the plaintext.
Plaintext: The original (secret) message. We want to share this without anyone else being able to eavesdrop.
Key: Secret “password”. This is used to encrypt and/or decrypt messages.
Bruteforce: Dumbly trying all possible combinations/keys.
Initialization vector: Initial starting value when doing chained block encryption.


Secret-key/Symmetric crypto

Secret-key crypto is the simplest form of encryption. You use the same key for encryption and decryption. A very simple usable function is the XOR function. When the key is truly random, as long as the message, and used only once, this is called the one-time pad or Vernam cipher.

In case of the one-time pad/XOR, one must be very sure not to encrypt two messages with the same key k. The key cancels out when an attacker XORs the two ciphertexts, leaving the XOR of the two plaintexts, from which the original messages can often be derived.
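
To make the key-reuse problem concrete, here is a small sketch in Python. The messages and the helper name xor_bytes are just made up for illustration. It shows that XOR'ing two ciphertexts made with the same key gives exactly the XOR of the two plaintexts, with no key needed.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"attack at dawn"
m2 = b"retreat at ten"
key = secrets.token_bytes(len(m1))  # random key, as long as the message

c1 = xor_bytes(m1, key)
c2 = xor_bytes(m2, key)  # the mistake: the same key is used a second time

# Decryption is the same XOR again.
decrypted = xor_bytes(c1, key)

# The attacker never sees the key, but it cancels out anyway.
leak = xor_bytes(c1, c2)
```

Here leak equals xor_bytes(m1, m2), so anyone who can guess part of one message immediately learns the matching part of the other.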

Secret key crypto is hard to use because the key has to be shared between the two parties without anyone else having access to it. This is even more cumbersome if the key has to be different for every message sent. You also need a separate secret key shared with every person that you want to communicate with, which means 0.5 * N * (N-1) keys for N people (45 keys for 10 people, 499,500 for 1,000). That adds up very, very quickly.


Public key/Asymmetric crypto

Public key cryptography solves this problem by having two types of keys. A public and a private key. The public key can be safely published for anyone to see and the private key is kept only by you. Anyone can encrypt a message with your public key but only you can decrypt it with your private key.

A good way of explaining this to a non-cryptographer is the example where the public key is a lock that you can send to anyone, but that only you have the key to. Anyone that wants to send you a message can request a lock (public key) from you, put the message in a crate and lock it, and then send the locked crate knowing that only you can unlock it. An attacker getting their hands on a lock is not a big deal, but getting the key to all the locks is.


Cipher functions (Caesar, substitution)

A small step back to the olden days. You might have heard of the Caesar cipher, where every character is replaced by the character N places further along the alphabet. This N is the key, and possession of N allows for decrypting the message. Decryption without the key was hard when it had to be done by hand, but with computers it is trivial to bruteforce (there are only 25 useful shifts). The class of ciphers that the Caesar cipher belongs to is called substitution ciphers: each character is simply replaced by another according to simple rules.
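
A quick sketch of the Caesar cipher and the bruteforce in Python. It only handles lowercase letters, and the function name caesar is my own choice. Bruteforcing is just trying all 26 shifts and looking for the one that reads like language.

```python
import string

ALPHABET = string.ascii_lowercase


def caesar(text: str, shift: int) -> str:
    """Shift every lowercase letter `shift` places; leave other characters alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


ciphertext = caesar("hello world", 3)  # "khoor zruog"

# Bruteforce: undo every possible shift and look at the results.
candidates = [caesar(ciphertext, -n) for n in range(26)]
for n, guess in enumerate(candidates):
    print(n, guess)
```

The line printed for n = 3 is the original message.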

Stream cipher

A stream cipher is a more generic name for encryption that happens bit-by-bit (or byte-by-byte). Take for example XOR’ing with a key that is just repeated over and over, which is simple but not secure. Processing data bit-by-bit can end up being relatively slow for larger data sizes.

Block Cipher

Generally an improvement to stream ciphers is to instead use fixed-size blocks. This often means that we have to add some padding to the content to fit the total block size.

The plaintext is split up into parts of block-size b, where if needed this is extended to not have any half-full blocks. (generally with 1000… or 0000..)
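
A minimal sketch of the "1000…" style of padding, done on bytes instead of bits: a single 0x80 byte (binary 10000000) followed by zero bytes. The function names are my own, and real libraries may use other padding schemes.

```python
def pad(data: bytes, block_size: int) -> bytes:
    """Append 0x80 and then zero bytes until the length is a multiple of block_size."""
    padded = data + b"\x80"
    padded += b"\x00" * (-len(padded) % block_size)
    return padded


def unpad(padded: bytes) -> bytes:
    """Strip the trailing zero bytes and the 0x80 marker again."""
    return padded.rstrip(b"\x00")[:-1]


def split_blocks(data: bytes, block_size: int) -> list:
    """Cut the padded data into blocks of exactly block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


blocks = split_blocks(pad(b"hello block ciphers", 8), 8)
```

The marker byte is what makes the padding removable: without it you could not tell padding zeroes apart from zeroes that were part of the message.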

Feistel cipher structure

The Feistel cipher is a structure used in for example the old (and unsafe) encryption standard DES. Many modern ciphers are based on the Feistel structure.

(Figure: Feistel cipher diagram, shamelessly stolen from Wikipedia)

The Feistel cipher works by having multiple rounds, where each round uses a different (sub)key (K0 to Kn). The input is split into two equal parts (left and right). In the first round the round function F is run over R0 with key K0, and the result is XOR’d with the left half L0. Then left and right are switched around for the next round. Given enough rounds (even 3-4, provided F behaves like a pseudorandom function) the cipher is already a pseudorandom permutation.

More info on the Wikipedia page, but this is not very important for the basics of cryptography.

Feistel ciphers have a very cool property: decryption uses exactly the same structure as encryption. If you feed the ciphertext in where the plaintext is normally put in and apply the subkeys in reverse order, it will output the plaintext. This works even if F is a one-way function such as a hash, because F itself never has to be inverted: the XOR operation is always reversible.
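
Here is a toy Feistel network in Python to show that property. It is a sketch for intuition, not a secure cipher. The round function (a truncated SHA-256 of subkey + half) and the subkeys are my own arbitrary choices.

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_function(half: bytes, subkey: bytes) -> bytes:
    """F: a one-way function, it never has to be inverted."""
    return hashlib.sha256(subkey + half).digest()[:len(half)]


def feistel(block: bytes, subkeys: list) -> bytes:
    """Run the Feistel rounds over one block of even length (at most 64 bytes)."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for k in subkeys:
        # new left = old right, new right = old left XOR F(old right, k)
        left, right = right, xor_bytes(left, round_function(right, k))
    return right + left  # final swap of the two halves


subkeys = [b"k0", b"k1", b"k2", b"k3"]
plaintext = b"16 byte message!"

ciphertext = feistel(plaintext, subkeys)
# Decryption is the very same function, with the subkeys reversed.
recovered = feistel(ciphertext, subkeys[::-1])
```

Note that there is no separate decrypt function at all: reversing the subkey order is enough.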

Modes of operation

There are multiple ways to structure the encryption steps of multiple blocks. Here are a couple of the simplest ones, but they might not all be secure. All the methods talked about here are about symmetric-key encryption (same key for encrypt and decrypt).

ECB

src: Wikipedia


ECB is perhaps the simplest mode of operation: every block is encrypted on its own without any input from other blocks. While this is simple and is good if you don’t want single-bit errors in encrypted blocks to propagate, identical plaintext blocks always produce identical ciphertext blocks, so patterns in the original data remain visible.

Tux, the cute Linux penguin

CFB

The circled plus sign is the XOR operation

CFB chains the encryption together, using the previous block’s ciphertext as input for the encryption of the next block. The first encryption step instead uses a randomly generated initialization vector, which is just a fancy word for starting value. The IV has to be shared with the receiver to allow for decryption later on; unlike the key it does not have to be secret, so it is usually sent along with the ciphertext.

This is a lot more secure because patterns in the input will not be discernible from the output anymore. The downside is that errors in one of the encryption steps will propagate throughout the other blocks.
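
A sketch of CFB chaining in Python. The encrypt_block function is a stand-in I made up (a truncated keyed SHA-256), not a real block cipher like AES. That is good enough here because CFB only ever runs the block function in the forward direction, also when decrypting.

```python
import hashlib

BLOCK = 16


def encrypt_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for a real block cipher (assumption, for illustration only)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        c = xor_bytes(encrypt_block(key, prev), plaintext[i:i + BLOCK])
        out.append(c)
        prev = c  # the ciphertext block feeds into the next step
    return b"".join(out)


def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor_bytes(encrypt_block(key, prev), block))
        prev = block
    return b"".join(out)


key, iv = b"secret key", b"0123456789abcdef"
message = b"same block here!" * 2  # two identical plaintext blocks
ct = cfb_encrypt(key, iv, message)
```

Even though both plaintext blocks are identical, the two ciphertext blocks differ, which is exactly the pattern hiding that ECB lacks.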

OFB

OFB is very similar to CFB in the way the chaining works, but the output of the encryption function is fed directly into the next step, before the XOR with the plaintext. This has the advantage that you can precompute the outputs of the encryption function as soon as you have the IV and key, and later run the XOR operations once you have the ciphertext, which can be done in parallel (fast 🙂 ).
The structure also closely resembles a stream cipher.
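
The precomputing idea can be sketched like this in Python. As before, encrypt_block is a made-up stand-in (truncated keyed SHA-256) rather than a real block cipher, and the key and IV are arbitrary example values.

```python
import hashlib

BLOCK = 16


def encrypt_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for a real block cipher (assumption, for illustration only)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def ofb_keystream(key: bytes, iv: bytes, n_blocks: int) -> bytes:
    """Only needs key and IV, so it can be computed before any data arrives."""
    stream, state = [], iv
    for _ in range(n_blocks):
        state = encrypt_block(key, state)  # output is fed straight back in
        stream.append(state)
    return b"".join(stream)


def ofb_apply(keystream: bytes, data: bytes) -> bytes:
    """Encrypting and decrypting are the same XOR with the keystream."""
    return bytes(d ^ k for d, k in zip(data, keystream))


key, iv = b"secret key", b"0123456789abcdef"
keystream = ofb_keystream(key, iv, n_blocks=2)  # precomputed

message = b"A" * 32
ct = ofb_apply(keystream, message)
pt = ofb_apply(keystream, ct)
```

Once the keystream exists, every byte can be XOR'd independently of the others, which is why this part parallelizes so well and why it looks like a stream cipher.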



RSA

RSA is one of the older and most common examples of asymmetric cryptography. You can (should) read more about it here: https://en.wikipedia.org/wiki/RSA_(cryptosystem) Until I get the will to write down more about the math involved… (too many p’s and q’s floating around in my head)



Elliptic curve

This is a more modern approach to cryptography, but sadly also a lot harder to implement. Elliptic curve cryptography is used for example in ed25519 SSH keys.
This post is often seen as a good introduction if you like math (or are just interested and have had enough coffee).

MariaDB clustering with galera

Some notes for myself written down publicly in the hopes that they might help some future person.

It is highly recommended to have the cluster on a VLAN/vswitch!
Galera does not implement its own security for the cluster ports. See the VLAN part at the bottom of this document if you do not have VLANs set up yet.

In this example the following IPs are used:
Local machine: 192.168.100.1
Another cluster node: 192.168.100.2

Make sure MariaDB is stopped before starting!

Edit /etc/mysql/mariadb.conf.d/60-galera.cnf

[galera]
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_address = "gcomm://192.168.100.1,192.168.100.2"
wsrep_provider_options="gmcast.listen_addr=tcp://192.168.100.1:4567;ist.recv_addr=192.168.100.1:4568"
binlog_format = row
wsrep_cluster_name="galeracluster"
default_storage_engine = InnoDB
wsrep_node_address = 192.168.100.1
innodb_autoinc_lock_mode = 2
wsrep_sst_receive_address=192.168.100.1:4444
bind-address = 192.168.100.1

If this is the first node in the cluster, or all nodes in the cluster are currently down, you have to run
galera_new_cluster to bootstrap the cluster again. Otherwise you can just start the mariadb service through systemd.

Now you’re good to go!
While everything should work now, if you want to read more about galera you can do so here.

Addendum: Vlans

You should keep the cluster on a vlan for security reasons. This can be done in netplan by editing /etc/netplan/(yourconfig).yaml and adding:

  vlans:
    enp0s31f6.4001:
      id: 4001
      link: enp0s31f6
      mtu: 1400
      addresses:
        - 192.168.100.1/24

Here enp0s31f6 is the interface name, 4001 is the vlan id, and 192.168.100.1 is the ip to assign to the local box.

Teaching, Optimized

Bit of an afterthought after seeing Patrick Winston’s talk

Mainly applicable to middle/high-school and smaller university classes. Depending on your country’s school system.

Minimize Distraction

  • Preferably blur out windows at eye height when sitting. Having something going on outside can completely destroy how information flows, however good your teaching is. This is often not something you can do much about as a teacher, but you can try to pick a room with as little distraction as possible.
  • Have students take notes. The error that a lot of lecturers make is that they expect students to make notes by themselves, but if you start without a notebook in front of you, you are in no way inclined to grab it when needed. Start the lecture with a short question, “does everyone have their books in front of them?”, and also give them some time to write down what you say. A subtle cue can also be to say that people do not have to write down x thing, implying that the rest must be written down.
  • Talking people are the death of any lecture. The best way to minimize this is to make sure you are clearly audible and also to show that you (and all other students) can hear it when a student is talking. Preferably have semi-small groups of no more than ~20 students or a room that is set up such that sound seems amplified, however counter-intuitive that may seem.

Minimize cutoff

  • Never, ever, have a slide full of text or, even worse, equations to talk through. People will lose you, and probably won’t catch on later either if you take too long. Break problems up into smaller sets, and preferably leave out duplicate information.
  • Give people a chance to jump back on after a while. Give people something to think about for a short time and then start a new part-subject. Make it obvious that you are restarting from scratch so people have a chance to get on.
  • Don’t be boring, but also not too hyper-active. A boring teacher will make his students bored, but a hyper-active teacher will probably also lose out on the energy of students really quickly. A teacher that likes their area is way more likely to generate interested students than one who seems to be bored. (if the teacher is bored then why would you ever want to learn more about it)
  • Be sure of what you are saying. If you have to think for a long time about what you are doing, people will quickly lose trust, start talking about it, and drop off.
  • Mainly for math subjects: You do not have to rigorously prove complicated things in your lectures. Students don’t care and they sure as hell don’t have the time to take it all in during the 1 minute that you have your slide up. Students really like to know “why” something is true, but giving intuition is often a thousand times better than plonking down a complicated formula and just reading that off of the slides.
  • Similar to the last point: don’t just show definitions. Show examples. A student cannot remember all your definitions, but they will remember examples. If needed, show them many edge cases that implicitly or explicitly explain the definition.
  • Again mainly for math: Where possible, use a blackboard or even digital drawing instead of slides. It is extremely important for students to grasp the steps of solving a problem, and doing it step by step allows you to explain your reasons, allows asking more specific questions from students, and better overall intuition. Blackboards are a lot of work, but what works similarly well is a drawing tablet projected on a screen. You don’t have to lose time and focus when clearing out the board, can go back to show how something correlates to a previous assignment, and share the results for later reference. Do make sure that you have a cheatsheet next to you so you don’t make any big mistakes. Or do and show how students can check themselves when they are solving problems.
  • Lectures often build on each other, but may be spread out. If you are handling a lot of information in a single lecture then it’s a good idea to have recitations after a day or two to help students remember.

Optimizing tests

  • Tests are not for teaching, please do not include too much new information that might confuse students. Application of the actual course content is of course fine or even imperative.
  • Give everyone their test back after they are graded. Show people where they made errors even if they do not want to redo it. Tests often give a great amount of different topics to see what you know, and re-reading a test is a great way to see what errors you made and where your ideas might differ from the teacher’s. (notationwise, etc). Don’t have students go somewhere to view their test or have them wait a long time. Stimulate them to actually take their time and take away any barriers.
  • Offer a demo-test. This is a great way to prepare students so they have a nice checklist of what to learn and a way to test this. If you think most of your students aren’t making the demo-test include one of the demo-test questions on the real test. This will give them a good reason to use the demo-test when learning and will greatly improve their grades. (don’t re-use more questions, don’t be that guy)
  • Make sure students have enough time. Your goal is to check if the student knows what you told them. Not how fast they can read and write
  • Use a lesson to talk through items of the last test that didn’t go well, and add a small question on the next test with that same item for a couple of points. If you have no reason to want them to still learn it then why were you testing them in the first place?

(Math) Textbooks

(Image: a page from the textbook “Structuri Algebrice in Informatica”)

The following example is taken from a friend’s mathematical textbook, but any student who has done math at a decent level will recognize the kind of text. It being in Romanian only helps prove the point.

Mixing math and normal language is disastrous for learning. Take mixing French and German in a text. You can’t read it easily even if you know both languages, let alone remember the content. For most humans, the language processor cannot switch between such different languages efficiently enough (though I’m not aware of any scientific research on this). Books like this want to be mathematically rigorous, but end up being completely useless. If you compare, for example, MIT’s Mathematics for CS book with this one, the difference is like night and day. MIT has formulas cleanly separated, with the written text focused on how they relate, while other books often have this mishmash of unreadable inline math that makes most people want to stop reading and just watch YouTube videos instead.

Asking questions

This is something that many education systems struggle with, and it’s almost become a culture thing in a lot of countries.

Educators often ask themselves why students are not asking questions. Even though it’s obvious from the next test that they didn’t understand everything well. And it’s a sad by-product of mass education.

The number of questions asked is often already decided in the first lecture by the speed and thoroughness with which an educator answers them. If a student has to wait a long time and the subject has already passed, the student (and for one question-asker many others) will already have missed the point, and having to go back feels like you are wasting everyone’s time.

If an explanation is too short, the student might not understand yet or feel dumb, which will stop them from asking more questions. If the explanation is too long the other students will quickly get distracted or the student will feel like it’s a waste of time. Try to keep your answers to questions to the global audience and make them part of your lecture, don’t just look the specific person in the eye and talk softly to them as you will surely lose all other students. The trick here is to gauge what level of understanding all students are on, but this is an extremely hard if not impossible task.

What my university did very well was create a Discord group with specific channels for each subject for asking questions. Being able to ask them semi-anonymously (under your real name, but no face) helps a lot with getting people to ask questions, and you have a lot more tools like a LaTeX bot or images. It would help if there was a bit more of a culture of also answering those questions, perhaps by having TAs answer some of them, but this fills a very important gap. Some universities have specific question hours, but these are often not very productive as people are not actively working on the content at those moments. It’s better to allow asking these questions when people are working on the content at school or at home.

Electrical filters

Filters are quite an important application in electronics. You may want to filter out lower or higher frequencies, or perhaps a band in the middle. This is very important in audio systems (think bass etc.), but also for many other electrical components.

Passive Low pass

Low pass filters simply let the low frequencies “pass” while not letting through higher frequencies.

https://www.electronics-tutorials.ws/filter/filter_2.html

The capacitive reactance of a capacitor is given by

$X_C = \frac{1}{2 \pi f C} = \frac{1}{\omega C}$, with $\omega = 2 \pi f$.

You can see that if f increases, the denominator gets larger and as a result Xc gets smaller and smaller. This gives the capacitor a lower impedance and thus a smaller voltage drop (X = U/I -> U = X * I), which means that Vout, measured across the capacitor, will be lower when the frequency increases. This is the basic idea of a filter.
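
To get a feeling for the numbers, here is a small Python sketch that evaluates Xc = 1/(2 * pi * f * C) for a few frequencies. The 100 nF capacitor is just an example value I picked.

```python
import math


def capacitive_reactance(f_hz: float, c_farad: float) -> float:
    """Xc = 1 / (2 * pi * f * C), in ohms."""
    return 1 / (2 * math.pi * f_hz * c_farad)


C = 100e-9  # 100 nF, example value

for f in (100, 1_000, 10_000, 100_000):
    print(f"{f:>7} Hz -> {capacitive_reactance(f, C):10.1f} ohm")
```

Every factor 10 in frequency divides the reactance by 10: roughly 15.9 kΩ at 100 Hz down to roughly 15.9 Ω at 100 kHz.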

Gain

low pass filter bode plot

The difference between Vout and Vin is often defined in Gain (dB), you might recognize this scale from how loud sound is, but it is important to know that we are talking about voltages here, not directly sound.

Gain is defined as 20 log10(Vout/Vin). The 20 comes from the decibel being defined on power ratios (10 log10) and power scaling with the square of the voltage. The straight slope that can be seen at the right of the image drops at -20dB/decade, where a decade is a factor 10 in frequency, so one step on a logarithmic scale.
normal: 1, 10, 100, 1000
decade (log10): 0, 1, 2, 3
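
A tiny Python sketch of the gain formula, with a few voltage ratios I picked to show the typical values you will run into:

```python
import math


def gain_db(v_out: float, v_in: float) -> float:
    """Voltage gain in decibels: 20 * log10(Vout / Vin)."""
    return 20 * math.log10(v_out / v_in)


unity = gain_db(1.0, 1.0)                    # nothing lost: 0 dB
tenth = gain_db(0.1, 1.0)                    # a factor 10 less voltage: -20 dB
half_power = gain_db(1 / math.sqrt(2), 1.0)  # about -3 dB
```

The last one is the famous -3dB point: the voltage has dropped to about 70.7% of the input, which corresponds to half the power.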

Cutoff frequency

The cutoff frequency Fc is the frequency at which the gain is -3dB and the phase shift is exactly 45 degrees. This is ultimately the magic that decides where your filter should start cutting off.

The cutoff frequency is defined as follows:

$F_c = \frac{1}{2 \pi R C}$

Where R is the resistor’s resistance and C is the capacitance of the capacitor.

You will see this RC factor come back below
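
As a worked example, here is the cutoff frequency Fc = 1/(2 * pi * R * C) in Python for a 4.7 kΩ resistor and a 47 nF capacitor (example values, pick your own). The gain function uses the standard magnitude response of a first-order RC low pass, 1/sqrt(1 + (f/Fc)^2), which is not derived in this post.

```python
import math


def cutoff_frequency(r_ohm: float, c_farad: float) -> float:
    """Fc = 1 / (2 * pi * R * C), in Hz."""
    return 1 / (2 * math.pi * r_ohm * c_farad)


def lowpass_gain_db(f_hz: float, fc_hz: float) -> float:
    """Gain of a first-order RC low pass at frequency f, in dB."""
    ratio = 1 / math.sqrt(1 + (f_hz / fc_hz) ** 2)  # Vout / Vin
    return 20 * math.log10(ratio)


R = 4700   # 4.7 kOhm, example value
C = 47e-9  # 47 nF, example value

fc = cutoff_frequency(R, C)                     # roughly 720 Hz
gain_at_fc = lowpass_gain_db(fc, fc)            # roughly -3 dB
gain_decade_up = lowpass_gain_db(10 * fc, fc)   # roughly -20 dB
```

So at the cutoff frequency you are at the -3dB point, and one decade higher the filter has already taken off about 20 dB.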

Time constant

The time constant τ (Greek tau) is defined as R * C.

This time constant is an indicator for the time needed to charge the capacitor. To be exact the time from 0 volts to approximately 63.2% (1-1/e) of the value of an applied DC voltage.
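
The 63.2% can be checked with a few lines of Python, using the standard charging curve of a capacitor through a resistor, V(t) = Vin * (1 - e^(-t/RC)). The component values are again just examples.

```python
import math


def charge_fraction(t_s: float, r_ohm: float, c_farad: float) -> float:
    """Fraction of the applied DC voltage across the capacitor after t seconds."""
    return 1 - math.exp(-t_s / (r_ohm * c_farad))


R = 4700   # 4.7 kOhm, example value
C = 47e-9  # 47 nF, example value
tau = R * C  # time constant in seconds, about 0.22 ms

after_one_tau = charge_fraction(tau, R, C)        # about 0.632
after_five_tau = charge_fraction(5 * tau, R, C)   # about 0.993
```

After one time constant the capacitor sits at about 63.2% of the applied voltage, and after five it is practically full.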

Building a boot-sector OS

Many technical people always want to re-invent the wheel. “Operating systems are bloated” and all kinds of that stuff. I challenge you to make your own. (with a bit of help from this tutorial, i won’t leave you hanging 🙂 )

Tools

We are going to build a simple boot sector operating system. This is basically an operating system that fits in the first 512 bytes of a disk, which the BIOS loads into memory and executes when booting from that disk.

To do this we will need 3 tools:
– NASM: An x86 assembler to turn your assembly language into a binary file.
– QEMU: Quick emulator. This can natively deal with your raw binary files and has text output.
– Any text editor, I like nano.

On a Debian-based Linux OS, these can be installed with:
apt -y install nasm qemu-system-x86 nano

Simple boot sector

jmp $   ; jump to current address - infinite loop

times 510-($-$$) db 0 ; fill the empty space with zeroes

dw 0xaa55            ; write the magic bytes 55 aa at the end of our file.

This is a simple piece of assembly that will jump infinitely to the current address. This will not do much apart from spinning up your fans, but it will be a great start!

The middle line fills the empty space with zeroes. $ evaluates to the address of the current line and $$ to the address of the start of the current section (here: the start of your program). So with 510-($-$$) you can pad the program to be exactly 510 bytes (and then 2 for the magic bytes following to make our 512).

As you can see, comments are added with the ; symbol. This works just like Python’s # or JavaScript’s //. It is smart to add plenty of comments to your assembly code as it will otherwise become quite hard to read.

We have to fill the last 2 bytes with 0xaa55 to show the BIOS that we have an operating system here. It wouldn’t be good if you started booting from your game storage disk would it?
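
If you want to see what NASM produces for this layout, the same 512 bytes can be put together by hand. This is a Python sketch of the layout only, not a replacement for the assembler. The function name is my own, and EB FE is the machine code that jmp $ assembles to (a short jump back onto itself).

```python
import struct

SECTOR_SIZE = 512


def build_boot_sector(code: bytes) -> bytes:
    """Pad machine code to 510 bytes and append the two magic bytes."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("code does not fit in a boot sector")
    padding = b"\x00" * (SECTOR_SIZE - 2 - len(code))  # times 510-($-$$) db 0
    signature = struct.pack("<H", 0xAA55)  # dw 0xaa55, stored little-endian
    return code + padding + signature


sector = build_boot_sector(b"\xeb\xfe")  # EB FE = jmp $
print(len(sector), sector[-2:].hex())
```

Note how the word 0xaa55 ends up on disk as the bytes 55 aa: x86 is little-endian, so the low byte is stored first.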

Running our first program

Save your code to a file called boot.asm and then run

nasm boot.asm -f bin -o boot_sect.bin

to assemble boot.asm into a boot_sect.bin that your PC can understand. You can then boot it with

qemu-system-x86_64 boot_sect.bin -curses

The -curses flag renders the VGA text output in your terminal.

You should see the bios boot screen pop up and see one core shoot to 100% (your jmp loop)

You might not be able to ctrl-c/z out of this. If this is the case you can open a separate shell and kill the qemu process id: kill $(ps -ef | grep curses | awk '{print $2}' | head -n 1)

But that’s just a BIOS boot screen?!

mov ah, 0x0e ; BIOS routine scrolling teletype

mov al, 'H' ; move byte for 'H' into the al register
int 0x10   ; execute the 0x10 "print to screen" interrupt
mov al, 'E'
int 0x10
mov al, 'L'
int 0x10
mov al, 'L'
int 0x10
mov al, 'O'
int 0x10
mov al, '!'
int 0x10



jmp $ ; freeze the screen so we can see our text


times 510-($-$$) db 0



dw 0xaa55 ; magic bytes

The code above will use a BIOS routine to print some text to your screen. Try it yourself and play around a bit with the text.

The code first loads the routine number (0x0e, teletype output) into the AH register and then the ASCII code for the letter into the AL register. Then it calls the 0x10 interrupt, which tells the BIOS to use the AH and AL registers to print something to the screen. You can find more on registers here. (absolutely worth reading after this article)

And then assemble and run:

Yay!

Logic

Right now we are just manually printing out registers, but it wouldn’t be a computer without some compute. So let’s do a simple calculation.

mov ah, 0x0e ; BIOS text typing routine

mov al, '1' ; our text, to make it look better
int 0x10
mov al, ' '
int 0x10
mov al, '+'
int 0x10
mov al, ' '
int 0x10
mov al, '1'
int 0x10
mov al, ' '
int 0x10
mov al, '='
int 0x10
mov al, ' '
int 0x10


; We are upping the ascii code here, not the actual integer
mov cl, 49 ; ascii code for '1', make sure to use another 8 bit register
add cl, 1 ; after '1' comes '2'
mov al, cl ; move to the al register that is used for printing
int 0x10 ; print!


jmp $ ; freeze the screen so we can see our text

times 510-($-$$) db 0

dw 0xaa55 ; magic bytes

The important part here is the block right above the jmp $ instruction.

Old computers use what’s called an “ascii table”, this is basically a mapping between numbers (bytes) and letters.

ASCII Table

As you can see, we start by moving the decimal number 49 into the cl (8 bit) register. In the table this corresponds to a ‘1’. NASM also lets you move the actual character directly, but that does the same thing under the hood.

The cool thing with ascii is that they are just numbers, you can increment 1 and get 2. But you can also increment 57 by one and end up with a “:”.
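
You can check the same arithmetic quickly in Python, where ord gives the ASCII code of a character and chr goes the other way:

```python
one = ord("1")        # 49, the same number we put in the cl register
two = chr(one + 1)    # "2": after '1' comes '2'
colon = chr(57 + 1)   # ":": 57 is '9', and the next entry in the table is ':'

print(one, two, colon)
```

This is exactly what the add cl, 1 instruction does: it has no idea it is working with a character, it just increments a number.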

The order of arguments might be a bit misleading to anyone who has never worked with assembly. Generally we use the first argument as the destination, and the second as the source. You can see this quite well in the add and mov instructions.

So let’s compile it and try!

yay!

Functions and loops

(WIP chapter) It would be a lot easier if we could print strings at once. Here’s a function that does just that!

Do not forget the org directive. It tells the assembler at which address your program is loaded in memory (the BIOS loads the boot sector at 0x7c00), so that the addresses of labels are correct when you use indirect addressing with square brackets.

Caveat: in this way of recursively calling yourself, the stack will not be restored when the null byte is reached. How do we fix this?

[org 0x7c00]

mov ah, 0x0e ; BIOS text typing routine

mov bx, str
call printstring

freeze:
  jmp $ ; freeze the screen so we can see our text


printstring:
  pusha
  cmp byte [bx], 0 ; Stop when we see a null byte
  je freeze ; done with all letters: jump straight to the freeze loop

  mov al, [bx] ; load the byte that bx points to into the al register
  int 0x10 ; print!

  add bx, 1 ; move to the next letter

  call printstring ; recursively print all letters


str:
  db 'Hello, Wqrld!',0 ; terminate with \0


times 510-($-$$) db 0

dw 0xaa55 ; magic bytes
You can now print strings

And onwards…

You did it, you have now made your own very very very basic OS. The rest is up to you 😉

Just joking, you can read more awesome stuff at https://www.cs.bham.ac.uk/~exr/lectures/opsys/10_11/lectures/os-dev.pdf

Learn by porting

People have different ways of learning a new ecosystem or library. Some prefer to just start and look at the docs as they go, some like to read through all the examples that can be found online. I would like to add one thing to that list: Learning by porting an old program over to the newest version of your ecosystem.

But, what?

Programs get made and abandoned. Updates happen and things break. An old piece of software made for the same ecosystem that you are learning right now might not (won’t?) just compile.

The process of changing a piece of software made for one ecosystem to a similar but different one is called porting. This is often not as easy as it seems, even less so in low-level languages like C. (and if you go low enough, even C will seem “high-level”)

Why porting

When porting over a piece of software, you generally have to have a decent understanding of the software and ecosystem you are working with. When starting out you won’t have that.

But. When porting over a piece of software you will learn a tremendous amount about the tooling and common setups. When googling errors (which will absolutely come up) you will find the appropriate documentation and forums for your piece of software, and you will often find many common mistakes that you will now not have to make. An added bonus is that you will learn about both the old, and the new ecosystem. Learning the new ecosystem is obvious, but learning about the old ecosystem will also help a lot when you are working with older blog posts and pieces of software that you may or may not want to “steal” some code from.

As an added bonus, the open-source community will have a new updated version of the piece of software that you have worked on. Great!

DPDK

As an example, I have recently been busy porting an old DPDK program over to a newer version of the library. I wanted to try and start by just writing out a basic program that would let me do what I wanted, but I immediately ran into trouble with outdated docs and undocumented missing libraries. After I got a simple version working I decided to try and port over an old program that I found on GitHub. By doing this I was able to easily discern old functionality and the accompanying new, improved version, so that I could also better understand the outdated documentation and use it to write newer programs.

Okay okay, I’ll give it a try.

This might not work for everyone, but i highly recommend that you try this method and see if it works for you. Good luck learning!

What every dev should know about web security: Hashing

I’m always surprised how little most backend web devs know about basic security measures. These are a couple of things i feel every web dev must know about:

word list:
plaintext: normal human-readable text
private key: a random string used in encryption
encryption: using a private key to turn plaintext into something that can only be read by others with that same key.
hashing function: A cryptographic function that takes some information and outputs a hash. You might have heard of md5, sha256, or NTLM.

Password Hashing

You should absolutely never store plaintext passwords in your database. DBs get hacked, staff has to look in the db for maintenance, people re-use their passwords everywhere.

You could encrypt the password using a private key, but anyone with the private key can still easily decrypt and gain access to the passwords.

A better way is to use a technique called hashing. See this as one-way encryption. You can hash a plaintext password and compare it to the hash stored in your database, but the hash can never be turned back into the original password. That information is just “lost”.

Username | Salt value | String to be hashed | Hashed value = SHA256 (Password + Salt value)
user1 | E1F53135E559C253 | password123E1F53135E559C253 | 72AE25495A7981C40622D49F9A52E4F1565C90F048F59027BD9C8C8900D5C3D8
user2 | 84B03D034B409D4E | password12384B03D034B409D4E | B4B6603ABC670967E99C7E7F1389E40CD16E78AD38EB1468EC2AA1E62B8BED3A
Source: Wikipedia


People then often start asking how in the world you are supposed to check if the password that the user supplied is correct. The answer is simple: just hash the user-supplied password again and see if it matches the hash that was created during registration. As long as the hash function, the salt, and the password are the same, the output hash will always be the same.
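The check described above can be sketched in a few lines of Python. This is only a minimal illustration: the function names `hash_password`, `register` and `verify` are made up for this example, and plain SHA-256 is used only because it mirrors the table. For real passwords a deliberately slow function such as bcrypt, scrypt or Argon2 is the usual recommendation.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, salt: str) -> str:
    """Return the hex SHA-256 of password + salt, like the table's last column."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def register(password: str) -> tuple[str, str]:
    """Create a random salt and return (salt, hash) to store in the database."""
    salt = secrets.token_hex(8)  # 16 hex characters, like the example salts
    return salt, hash_password(password, salt)


def verify(password: str, salt: str, stored_hash: str) -> bool:
    """Hash the supplied password again and compare it with the stored hash."""
    candidate = hash_password(password, salt)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(candidate, stored_hash)


salt, stored = register("password123")
print(verify("password123", salt, stored))  # correct password
print(verify("password124", salt, stored))  # wrong password
```

Only the salt and the hash are stored. The plaintext password exists just long enough to be hashed.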

Think about it in the following way: a hash always has the same fixed length, so in theory there are multiple passwords that produce the same hash. E.g. hash("mycoolpassword") could theoretically equal hash("Thisisaverystrongpassword"). The chance of this happening by accident or on purpose is extremely low, but it does mean that a hash can never point back to one single original password. Since password hashing functions are generally made to be quite “resource-intensive”, it is not viable for an attacker to try every single password in existence, although hashes have been precomputed for every combination of characters in short passwords. These precomputed tables are called rainbow tables and can be downloaded online, with file sizes in the terabytes.
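To get a feel for why precomputed hashes are dangerous, here is a small sketch. It is a plain lookup table and not a real rainbow table (those use chains of hashes to save disk space), and the word list is a made-up example, but the attack idea is the same: hash the candidates once, then look up every leaked hash instantly.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Unsalted SHA-256 of a string, as a hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical tiny word list; real tables cover billions of candidates.
common_passwords = ["123456", "password", "qwerty", "letmein"]

# Precompute hash -> password once, then reuse it for every leaked hash.
lookup = {sha256_hex(p): p for p in common_passwords}

leaked_hash = sha256_hex("qwerty")       # pretend this came from a database dump
print(lookup.get(leaked_hash))           # qwerty
print(lookup.get(sha256_hex("x9!Lq2")))  # None, this password is not in the table
```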

Salting and Peppers

A solution to these rainbow tables is called salting:

You can add a small string that is unique to every user (you might use their username, but it’s better to use something randomly generated) and use that in hashing: hash(password + salt). As long as this salt is unique to your application/user it won’t be as trivial to just look up the resulting hash online.

Some applications also use something called a pepper, which is an application-wide secret that gets added on top of the per-user salt. Unlike the salt it is not stored in the database, so it makes the hashes even harder to crack when an attacker has both the hash and the salt, but it is generally seen as overkill.
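A rough sketch of both ideas, assuming the combination is simply hash(password + salt + pepper). The `APP_PEPPER` environment variable and its fallback value are hypothetical names picked for this example, and SHA-256 again stands in for a proper password hashing function.

```python
import hashlib
import os
import secrets

# Hypothetical: the pepper is kept outside the database, e.g. in an environment variable.
PEPPER = os.environ.get("APP_PEPPER", "example-pepper-change-me")


def hash_password(password: str, salt: str, pepper: str = PEPPER) -> str:
    """hash(password + salt + pepper) with SHA-256, returned as hex."""
    return hashlib.sha256((password + salt + pepper).encode("utf-8")).hexdigest()


# Two users who happen to pick the same password get different random salts
salt_a = secrets.token_hex(8)
salt_b = secrets.token_hex(8)

hash_a = hash_password("password123", salt_a)
hash_b = hash_password("password123", salt_b)
print(hash_a != hash_b)  # the stored hashes differ, so one lookup cannot crack both

# Same password and salt but another pepper: the hash changes again
print(hash_password("password123", salt_a, pepper=PEPPER + "x") != hash_a)
```

An attacker who only has the database dump sees the salts but not the pepper, which is the whole point of keeping it somewhere else.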

Conclusion

Hashing is very important for any application that handles passwords. There are many more security topics that you should read about, like CSRF, XSS, and SQL injection, but i will leave those for now.

Statistics basic: stddev and z-score

I’ve been trying to wrap my head around some of the statistics/data science used for dissecting DDoS attacks, and came across a couple of new topics that are quite important but rarely explained.

Sources

https://www.wiskunde.net/standaarddeviatie

Standard deviation

Standard deviation is a property of a set that describes the spread around the mean.

S_x = σ = the standard deviation of the set
x_i = element number i of the set
x_gem = the mean of the set
n_x = the total number of elements in the set

\[ \sigma = S_x = \sqrt{\frac{\sum_{i=1}^{n_x} (x_i - x_{gem})^2}{n_x}} \]

SRC: wiskunde.net
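A small worked example of the formula, using a made-up set of eight numbers. Note that this is the population standard deviation (dividing by the number of elements), which is what `statistics.pstdev` in the Python standard library computes as well.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical example set

mean = sum(values) / len(values)                      # the mean of the set
squared_diffs = [(x - mean) ** 2 for x in values]     # squared distance to the mean
sigma = math.sqrt(sum(squared_diffs) / len(values))   # divide by the count, take the root

print(mean)                       # 5.0
print(sigma)                      # 2.0
print(statistics.pstdev(values))  # 2.0, the standard library equivalent
```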

Z-score

z-score: an easy, normalized way of seeing whether something is above or below the average, and whether it is an outlier (a z-score above 3 or below -3 is often seen as an outlier).

\[ Z = \frac{X - \mu}{\sigma} \]
SRC: statistiekbegleider

mean = average
Z-score = (Measurement – mean) / stddev

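In Python (with pandas):

# fillna(0) is an assumption: it covers rows where the z-score is NaN (for example when stddev is 0)
df['zscore'] = ((df['count'] - df['count'].mean()) / df['count'].std(ddof=0)).round().fillna(0)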
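The same calculation can be sketched without pandas, which makes the individual steps easier to follow. The request counts below are invented for the example: 19 quiet minutes and one spike.

```python
import statistics

# Hypothetical per-minute request counts: 19 quiet minutes and one spike.
counts = [10] * 19 + [100]

mean = statistics.fmean(counts)
stddev = statistics.pstdev(counts)  # population stddev, same as ddof=0 in pandas

# z-score = (measurement - mean) / stddev
z_scores = [(c - mean) / stddev for c in counts]

# Flag everything more than 3 standard deviations away from the mean.
outliers = [c for c, z in zip(counts, z_scores) if abs(z) > 3]
print(outliers)  # [100]
```

With very small sets a single value can never reach a z-score of 3, so the threshold only becomes meaningful once there are enough measurements.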

Extra: Newton Binomial

\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \]

If we take n = 10 and k = 3 (also called “10 choose 3”), we find the outcome to be 120.

The Newton binomial (also known as the binomial coefficient) is used to find the number of ways to choose k (three) elements out of n (10). Take for example the number of combinations of toppings you can choose on a pizza when you have to pick exactly 3 from a total pool of 10 options.
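The number 120 can be checked in a few ways. The sketch below uses the formula directly, the `math.comb` shortcut from the standard library, and a brute-force count with made-up topping names.

```python
import math
from itertools import combinations

n, k = 10, 3

# Direct use of the formula n! / (k! * (n - k)!)
by_formula = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Standard library shortcut (Python 3.8+)
by_comb = math.comb(n, k)

# Brute force: list every set of 3 toppings out of 10 and count them
toppings = [f"topping{i}" for i in range(1, n + 1)]  # hypothetical names
by_counting = len(list(combinations(toppings, k)))

print(by_formula, by_comb, by_counting)  # 120 120 120

# "At most 3" toppings means 0, 1, 2 or 3 toppings, so those counts are summed
at_most_three = sum(math.comb(n, i) for i in range(k + 1))
print(at_most_three)  # 176
```

So 120 is the answer for exactly three toppings. Allowing fewer than three as well gives 176.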

Getting, and keeping players

Note beforehand: most of these ideas relate to more common marketing tactics. A lot of this comes down to having players *trust* your server.

Keeping players – short term

  • Don’t overload them with information. People are here to play and not to read walls of signs. This will already drop their enthusiasm for a new server.
  • Have players online. Nothing screams “run” more than an empty server. This is of course a catch-22, but a very important one to think about when, for example, making rules against AFKing.
  • Promote welcoming new players. This can absolutely make someone enthusiastic about finding a good community and will improve their chance of staying by a ton. You can start this by just welcoming new players when they join and you or another staff member are online. When other players see this they will copy the behaviour. You can also add certain funny welcoming commands with colors or send players a random “+1 karma” message.
  • Show people that you don’t have a reset coming any time soon. This is one of the most important things you can do for a lot of servers if you want to have long-term players.

Keeping players – long term

This one is harder, and also largely depends on how active your playerbase is.

  • Always. Always have something for your players to do. Something grindable is great for this (think walls for factions or farms), but you can also promote building cities or other bigger builds.
  • Events are a great way to pull in many players at once, including ones that had stopped playing.
  • Know how to handle griefing, preferably without staff intervention. If players know beforehand that their building will stay up long-term they will invest more time in it. On top of that, you won’t have anyone quitting over a griefed project that they put so many hours into. Rolling back might be an option, but sometimes people decide to quit before asking a staff member if something can be rolled back.
  • Make sure the difficulty scales well with playtime. This one is hard to get right, but having a good start and then keeping some sort of difficulty when you go higher up in the game makes it entertaining for a longer time. While a lot of people like grinding, there should still be challenges for them to tackle that aren’t too easy or too hard. Don’t make challenges that take a month to complete, as people are not ready to invest that much time without some reward in between.