Hello my fellow noob coder! If you have ideas but you are not sure how to turn them into code then this tutorial is for you.
So the thing is that I keep encountering encoded text, and normally I use a browser plugin named HackBar to decode it, which works fine. But HackBar’s capabilities are limited: it can’t decode binary, FromChar encoding, etc.
Moreover, I always wanted a program that could decode Caesar-based ciphers for me. Unfortunately, I couldn’t find such a program, so I decided to code my own, and this is the journey you are about to read.
Oh wait…lemme show you how Decodify looks:
It may seem like an alien thing to you, but you will understand everything by the end.
My first aim was to create a program to decode ROT/Caesar-based ciphers. For example, let’s say that ujkv is a ROT2-encoded string. That means the letters have an offset of 2: a is replaced by c, b is replaced by d, and so on…
So the decoded form of ujkv will be shit.
You must be thinking that it’s simple: we just have to subtract the offset from each letter of the given string. But letters are letters, not numbers, and hence we can’t apply mathematical operations to them directly.
Well, the solution was simple. Assign a number to each letter: 1 to a, 2 to b, and so on up to 26 for z. Then detect which letter and offset the user entered. Let’s say the user entered u with an offset of 2. Our program needs to detect that the user entered u, and we already assigned a number to u, i.e. 21, so now we just have to subtract the offset from it: 21 - 2 = 19. Now we have a new number, and we have to see which letter has the value 19 assigned to it. That’s right, it’s s! So we just figured out that u with an offset of 2 means s.
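Before the real code, here is a rough sketch of that pen-and-paper idea. It is only an illustration under my own assumptions: the names letter_to_number, number_to_letter and decode_letter are made up for this example, it uses the 1 to 26 numbering described above, print() is written as a function so it also runs on Python 3, and it makes no attempt to handle results that drop below 1.

```python
import string

# Assign a number to each letter: a -> 1, b -> 2, ..., z -> 26
letter_to_number = {
    letter: number
    for number, letter in enumerate(string.ascii_lowercase, start=1)
}
# The reverse lookup: 1 -> a, 2 -> b, ..., 26 -> z
number_to_letter = {number: letter for letter, number in letter_to_number.items()}


def decode_letter(letter, offset):
    """Hypothetical helper: subtract the offset and look up the new letter.

    No wrap-around is handled here; a result below 1 raises KeyError.
    """
    return number_to_letter[letter_to_number[letter] - offset]


print(letter_to_number['u'])   # u is number 21
print(decode_letter('u', 2))   # 21 - 2 = 19, and 19 is s
```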
The Actual Code
Wew! Even idiots can talk, but turning a solution into lines of code is the real deal. So let’s do this!
First of all, we will define a list containing all the letters of the alphabet:
li = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
Now let’s write the code which will find out which letter the user has entered.
alphabet = raw_input('Enter an alphabet: ')
number = 0
for replace in li:
    if alphabet == replace:
        break
    else:
        number = number + 1
Let’s break things down:
- alphabet is a variable which will store the value of the user input.
- number is a variable which currently has a value of 0, but we will use it to store the position of alphabet in the list of letters we defined earlier, i.e. li.
- In the next line, we are iterating over the list li. That’s a for loop, in simple words. On each pass, the variable replace contains one element from the list li.
- After that, we are checking if the user input alphabet is equal to the current element of the list li. Why? Just a minute, I will explain everything.
- The loop will break if alphabet is equal to replace. In simple words, the loop will break as soon as we find a match for the user input in our list of letters.
- else? Well, if the user input doesn’t match the current element of the list, we add 1 to the variable number.
So what does this code really do?
Well initially the variable number = 0.
Now let’s say the user has entered c. We take the first element from the list, i.e. a, and compare it with c. So… is a equal to c? No? Cool, since they do not match we add 1 to the value of number, so it becomes number = 1. Then we continue the loop and pick the next element in the list, i.e. b. So… is b equal to c? No? Cool, we add 1 again to the value of number, so it becomes number = 2. Now we continue the loop and pick the next element from the list, i.e. c. So… is c equal to c? Yes it is! We break the loop now. So what’s the value of number at this point? It is 2. What is the position of c in the alphabet? a… b… c… Hmm… it’s 3, right? So why did we define number = 0 at the start instead of number = 1? Well, it is because positions start at 0 in lists.
So if we do
print li[2]
it will print c and not b. So where were we? Yeah, I was saying that this code will help us find the position of the letter entered by the user in the list of letters, i.e. li.
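To see the counting loop and the zero-based positions in action without typing anything at a prompt, here is a small sketch. The function name position_of is my own invention for this example, and print() is written as a function so it also runs on Python 3. The last line uses the built-in list.index() method, which gives the same answer as the hand-written loop for any letter that is in the list.

```python
li = list('abcdefghijklmnopqrstuvwxyz')


def position_of(letter):
    """Hypothetical wrapper around the counting loop from the tutorial."""
    number = 0
    for replace in li:
        if letter == replace:
            break
        number = number + 1
    return number


print(position_of('c'))  # 2, because counting starts at 0
print(li[2])             # c
print(li.index('c'))     # 2, the built-in way to get the same position
```

One thing worth noticing: if the input is not in the list at all, the loop never breaks and number ends up as 26, which is not a valid position.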
But there’s a problem. What if the user enters S instead of s? Moreover, what if the user enters 4 or @? Our list doesn’t contain any of these, so we have to work around that. Let’s change the code a bit.
from re import search

alphabet = raw_input('Enter an alphabet: ').lower()
number = 0
match = search(r'[a-z]', alphabet)
if match:
    for replace in li:
        if alphabet == replace:
            break
        else:
            number = number + 1
else:
    pass
First of all, I added .lower() at the end of raw_input() so that the input is converted to lowercase. After that, we added a new thing, i.e. search(r'[a-z]', alphabet), which comes from Python’s re module.
Well, this is something we call RegEx. It is used to find patterns in strings. [a-z] is the regex pattern which will help us check whether the input contains one of the letters a, b, c… z.
If a match is found, search() returns a match object, which counts as true in an if statement. If nothing matches the regex pattern, it returns None, which counts as false. After that, we used if match: which means we want to do something only if a match was found. Why are we doing that? Well, let’s say the user enters 6. The program would compare it against every element of the list li, and that would be wasted time. So we make sure that the input is actually a letter before finding its position in the list. Otherwise, if the input doesn’t belong to a-z, we just pass, which means doing nothing.
Phewww… everything seems sorted out now. Well, our program is able to find the position of a single letter, but we want it to find the position of every letter in a string. So we will add more lines to the code. Let’s get going!
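If you want to see what search() actually gives back, here is a tiny sketch. The helper name is_letter is hypothetical (it is not part of the program we are building), and print() is written as a function so the snippet also runs on Python 3.

```python
from re import search


def is_letter(char):
    """Hypothetical helper: True if char contains a lowercase letter a-z."""
    return search(r'[a-z]', char) is not None


print(search(r'[a-z]', 'c'))  # a match object, which is truthy
print(search(r'[a-z]', '6'))  # None, which is falsy
print(is_letter('c'))
print(is_letter('@'))
```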
rotable = raw_input('Enter your string: ').lower()
rotable = list(rotable)
for char in rotable:
    number = 0
    match = search(r'[a-z]', char)
    if match:
        for replace in li:
            if char == replace:
                break
            else:
                number = number + 1
    else:
        pass
Ahaan! There are some noticeable changes. Let’s break down the changes and additions.
- We are not assigning user input to the variable alphabet anymore because now we are taking strings as input. The new variable name is rotable. Weird name? Well it’s my code, I can do whatever the fuck I want to.
- There’s a line, rotable = list(rotable). Well, rotable holds the string entered by the user and list(rotable) converts it to a list. So now rotable is a list whose elements are the characters of the string entered by the user. For example, if the user entered orange, the value of rotable will be rotable = ['o', 'r', 'a', 'n', 'g', 'e']
- Now that rotable is a list, we can use its elements separately, so we create a for loop: for char in rotable.
Well, nothing much changes after that. We take the first element of the list rotable and do all that comparing shit, then we take the second element and do the same stuff. Also notice that number = 0 now sits inside the for loop, so the count starts fresh for every character.
Congrats! We made it! We can finally take a string as input and find the position of each character in it. Now let’s take a look at aim two!
We have the position of each letter in the string entered by the user, and now all we have to do is subtract the offset from those positions. Say the user enters a string and tells us it’s encoded with ROT13: our program will find the position of each letter of the string, subtract 13 from it and print whatever letter holds this new value as its position. Like if the user entered s, we know its position is 19 *coughs* I mean its position is 18, because as I told you, lists start from 0 and not 1. So s has a positional value of 18, and if we subtract 13 from it, we get 18 - 13 = 5. Now which letter has a positional value of 5?
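Here is a sketch of that first milestone that you can run without a prompt. It is not the exact code above: the function name positions is hypothetical, it collects (character, position) pairs in a list so you can see the result, it uses list.index() instead of the counting loop, and print() is written as a function so it also runs on Python 3.

```python
from re import search

li = list('abcdefghijklmnopqrstuvwxyz')


def positions(text):
    """Hypothetical helper: pair each character with its position in li.

    Characters that are not a-z get None instead of a position.
    """
    result = []
    for char in list(text.lower()):
        if search(r'[a-z]', char):
            result.append((char, li.index(char)))
        else:
            result.append((char, None))
    return result


print(positions('Or4nge'))
```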
a b c d e f g h
0 1 2 3 4 5 6 7
Ummm, f has a value of 5. So the decoded form of s is f! That’s pretty cool. That’s all we have to do: get input from the user, find the positions, subtract the offset and print the new rotated letters.
new_position = real_position - offset
print li[new_position]
Does this code seem right to you? Well, for those who don’t understand what it does, don’t sweat it. Let me explain it to you.
We are subtracting the offset from real position to get new position and then we are printing the element from the list li which is at that position.
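One thing the subtraction quietly relies on: what happens when the new position drops below 0, for example a (position 0) with an offset of 13? In Python a negative list index counts from the end of the list, so li[-13] is n, which happens to be exactly the letter we want. That only works for offsets up to 26, though. The sketch below shows this behaviour, and the % 26 on the last line is my own suggestion for larger offsets, not something from the program in this tutorial.

```python
li = list('abcdefghijklmnopqrstuvwxyz')

print(li[18 - 13])         # f, the example from above
print(li[0 - 13])          # n, a negative index counts from the end
print(li[(0 - 30) % 26])   # w, the modulo keeps the index inside 0..25

try:
    li[0 - 30]             # without the modulo, -30 is out of range
except IndexError:
    print('offset too large for plain negative indexing')
```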
Now let’s implement it:
from re import search

li = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
rotated = []
rotable = raw_input('Enter your string: ').lower()
offset = raw_input('Enter the offset: ')

def rotx(number, offset):
    replace = number - int(offset)
    rotated.append(li[replace])

def find(rotable, offset):
    rotable = list(rotable)
    for char in rotable:
        number = 0
        match = search(r'[a-z]', char)
        if match:
            for replace in li:
                if char == replace:
                    rotx(number, offset)
                    break
                else:
                    number = number + 1
        else:
            rotated.append(char)

find(rotable, offset)
print ''.join(rotated)
So I created two functions, find() and rotx().
- find() does what the old code was doing.
- rotx() is the function which we use to subtract the offset and pick the new letter.
Want me to break down the changes? Alright.
- We defined a new list rotated which is currently empty, but we will store all the processed characters here. By processed I mean the characters we get after subtracting the offset.
- We are now asking the user to enter the value of the offset.
- The rotx() function needs two arguments, number and offset, where number is the original position of the letter and offset is the value we subtract from number in the very first line of rotx(). Since raw_input() returns a string, we convert offset with int() before subtracting.
- rotated.append(li[replace]) might be confusing for you. Well, as you know, rotated.append() is used to add a value to the list named rotated. But what’s up with li[replace]? Well, let’s say the original position was 19 and the offset was 9, hence the value of replace will be 19 - 9 = 10. Now we are doing li[replace], or you can say li[10], which will extract the 10th *coughs* I mean the 11th element from the list li. And with the append statement, we add that 11th element to the new list, i.e. rotated.
- There’s a change in the last line of the find() function. If the character is not a letter, we just append it to the rotated list so it gets printed as it is.
- The last thing I think I should explain is print ''.join(rotated). Well, as I have mentioned above, the list rotated contains the characters which have been processed, or you can say decoded. So ''.join(rotated) will join all the elements into one string. For example, rotated = ['d', 'a', 'm', 'n'] will become the string damn. Easy, right?
Well, that’s all! In the next chapter, we will be coding it further. Till then, Keep Learning! Keep Coding!
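If you want to try the whole idea without typing at a prompt, here is a compact variant. To be clear, this is not the program above: decode_rot is a name I made up, it returns the decoded string instead of filling a global list, it uses the in operator and str.index() instead of the regex and the counting loop, it adds % 26 so offsets larger than 26 do not crash, and print() is written as a function so it also runs on Python 3.

```python
import string

LETTERS = string.ascii_lowercase  # 'abcdefghijklmnopqrstuvwxyz'


def decode_rot(text, offset):
    """Hypothetical all-in-one decoder: shift every letter back by offset."""
    decoded = []
    for char in text.lower():
        if char in LETTERS:
            new_position = (LETTERS.index(char) - int(offset)) % 26
            decoded.append(LETTERS[new_position])
        else:
            # Anything that is not a-z is kept as it is
            decoded.append(char)
    return ''.join(decoded)


print(decode_rot('ujkv', 2))
print(decode_rot('Uryyb, Jbeyq!', '13'))  # the offset may also be a string
```

Returning a string makes the function easy to reuse and test, while the tutorial’s version with a global rotated list is easier to follow step by step.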