Python Forensics
As the name suggests, forensics is the process of analyzing and investigating data. Wondering how Python helps in forensics?
Python helps in performing these operations on digital data. We will cover different operations and the corresponding modules that Python offers for them. Let us start with an introduction to forensics.
What is Forensics in Python?
Nowadays most of us spend much of our time on the internet, and cyber-attacks are something we all try to avoid and protect ourselves from. As the number of cyber criminals grows, so does the need to investigate the damage they cause and to gather evidence against them.
Cyber forensics deals with these problems, and Python provides different modules for different forensic applications. It helps in gathering information such as digital data, evidence, and passwords.
Computational forensics is the part of cyber forensics that deals with digital evidence. It helps with pattern-based evidence such as fingerprints, tool marks, and documents. Other patterns include physiological and behavioral patterns, DNA, and digital evidence at crime scenes. In addition, it draws on algorithms from data mining, computer graphics, machine learning, computer vision, data visualization, and statistical pattern recognition.
Before going into the coding part, let us see some of the naming conventions in Python forensics.
Naming Conventions in Python Forensics
Type | Naming Convention | Example |
Local variable | Should be in camelCase; the underscore is optional | rollNumber |
Constant | Should be in uppercase, with the words separated by underscores | ROLL_NUMBER |
Global variable | Should have the prefix gl_ and be in camelCase; the underscore is optional | gl_rollNumber |
Function | Should be in PascalCase and active voice; the underscore is optional | MyRollNumber |
Module | Should start with _ and be in camelCase | _rollNumber |
Class | Should have the prefix class_ and be in PascalCase; better to keep it short | class_RollNumber |
Object | Should have the prefix ob_ and be in camelCase | ob_rollNumber |
Hash Function in Python
A hash function is a cryptographic tool. It takes an input of any size and generates a fixed-size value called a hash or digest. The same input always produces the same hash, and different inputs practically never share one. The mapping is one-way: once the data is hashed, the original cannot be recovered from the hash, so hashing is not the same as encryption. Hashing is useful when we need to store passwords safely or verify that files or other digital data have not changed. Let us see an example.
Example of Python hash function:
import hashlib
import uuid

def hash_func(password):
    # Generate a random 128-bit salt as a 32-character hexadecimal string
    salt = uuid.uuid4().hex
    # Hash the UTF-8 encoded salt and password together with SHA-256,
    # then append the salt so the hash can be recomputed later
    return hashlib.sha256(salt.encode() + password.encode()).hexdigest() + ':' + salt

def check_password(hashed_password, user_password):
    # Split the stored string into the hash and the salt
    password, salt = hashed_password.split(':')
    # Re-hash the supplied password with the same salt and compare
    return password == hashlib.sha256(salt.encode() + user_password.encode()).hexdigest()

new_pass = input('Enter the new password: ')  # read the password to store
hashed_pass = hash_func(new_pass)             # hash the password
print('The hash string to store in the database is: ', hashed_pass)
Output:
The hash string to store in the database is: 937219369c4e241b762f55c13fe7bd0e5c44a343ad84818a123e72762f8d9106:8b1966c93b4b41938933b635b9ac133f
Now let us enter the password again and check whether it is correct using the check_password() function.
Example of Python hash function:
old_pass = input('Enter the password: ')
print(check_password(hashed_pass, old_pass))
Output:
Enter the password: abc@123
True
Example of Python hash function with a wrong password:
old_pass = input('Enter the password: ')
print(check_password(hashed_pass, old_pass))
Output:
False
We can see that when we gave the wrong password, we got False as the output. More generally, a good cryptographic hash function has the following properties:
1. We can easily compute the hash value for an input
2. It is impractical to generate the original value from a given hash value
3. It is impractical to find two different values with the same hash value
4. It is impractical to modify the original input without changing the hash value
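Some of these properties can be checked directly with hashlib. The following is a minimal sketch, not a proof: the sample strings are made up for illustration, and SHA-256 is chosen only because the example above uses it.

```python
import hashlib

def sha256_hex(data: str) -> str:
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

original = "evidence.log contents"    # made-up sample input
tampered = "evidence.log contents!"   # same input with one extra character

digest_a = sha256_hex(original)
digest_b = sha256_hex(original)
digest_c = sha256_hex(tampered)

print(digest_a == digest_b)           # True: same input, same digest
print(digest_a == digest_c)           # False: changed input, different digest
print(len(digest_a), len(digest_c))   # 64 64: digest length is fixed
```

The second comparison illustrates property 4: even a one-character change to the input produces a different hash value.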
Cracking the Encryption in Python
During an investigation, the data we recover is often encrypted and must be cracked before it can serve as evidence. We will discuss the process in this section. Before going into the code, let us learn more about cryptography.
The original message is in human-readable form and is called plain text, whereas the ciphertext is the encrypted form of the plain text. For example, we can take each letter in the plain text and shift it back by 2 places in the alphabet, which turns each A into a Y, each B into a Z, and so on. There are many encryption schemes of this kind. The pattern-based evidence an investigator may need to decode can be broadly classified into:
1. Tire Tracks and Marks
2. Impressions
3. Fingerprints
Let us see an example that cracks a shift cipher by trying every possible shift.
Example of decrypting:
def decryption(shift, cipher):
    text = ''
    for each in cipher:
        # Shift the character back, wrapping within the printable ASCII range 32..126
        x = (ord(each) - 32 - shift) % 95 + 32
        text += chr(x)  # convert the ASCII value to a new character
    print(text)

cipherText = input('Enter the message: ')
# Try every possible shift; one of the printed lines is the plain text
for i in range(1, 95, 1):
    decryption(i, cipherText)
Output:
The script prints 94 candidate strings, one for each shift from 1 to 94. The investigator scans them for the line that reads as meaningful text.
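To experiment with the brute-force idea, it helps to have a matching encryption step. The sketch below is one possible round trip, assuming a shift cipher over the printable ASCII range 32 to 126. The helper names shift_text and brute_force, the sample message, and the key 7 are all hypothetical.

```python
FIRST, SPAN = 32, 95  # printable ASCII runs from 32 (space) to 126 (~)

def shift_text(text: str, shift: int) -> str:
    """Shift each printable ASCII character by `shift`, wrapping within 32..126."""
    shifted = []
    for ch in text:
        code = ord(ch)
        if FIRST <= code < FIRST + SPAN:
            code = (code - FIRST + shift) % SPAN + FIRST
        shifted.append(chr(code))
    return "".join(shifted)

def brute_force(cipher: str) -> dict:
    """Return every candidate plain text, keyed by the shift that produced it."""
    return {shift: shift_text(cipher, -shift) for shift in range(1, SPAN)}

secret = shift_text("Meet at 9pm", 7)   # hypothetical message and key
candidates = brute_force(secret)
print(len(candidates))                   # 94 candidates, one per shift
print(candidates[7])                     # Meet at 9pm
```

Returning the candidates in a dictionary rather than printing them makes it easy to filter them afterwards, for example by keeping only the strings that contain common words.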
Python Virtualization
When we imitate parts of an IT system such as workstations, networks, and storage, the process is called virtualization. The virtual hardware is emulated by a hypervisor.
Virtualization offers the following benefits for forensics:
1. A workstation can be kept in a validated state for each investigation.
2. We can recover deleted data by attaching the dd image of a drive as a secondary drive on the virtual machine.
3. The same virtual machine can serve as a recovery machine for gathering evidence.
Now let us see the steps to create a virtual machine in Python.
1. Let the virtual machine name be vm_demo, and let us give the machine 512 MB of memory.
vm_demo_memory = 512 * 1024 * 1024
2. Now, we need to attach the machine to a cluster.
vm_demo_cluster=api.clusters.get(name = "Default")
3. And now, we boot the machine from the virtual HDD
vm_os = params.OperatingSystem(boot = [params.Boot(dev = "hd")])
Now we will combine the above steps to form a virtual machine:
Example of virtualization in Python:
from ovirtsdk.api import API
from ovirtsdk.xml import params

try:
    # Connect to the API; replace the placeholder URL and credentials with real ones
    api = API(url="https://HOST", username="ABC", password="abc@123", ca_file="ca.crt")
    vm_name = "vm_demo"                             # name of the machine
    vm_memory = 512 * 1024 * 1024                   # 512 MB of memory, in bytes
    vm_cluster = api.clusters.get(name="Default")   # cluster to attach to
    vm_template = api.templates.get(name="Blank")   # template to build from
    # Boot the machine from the virtual hard disk
    vm_os = params.OperatingSystem(boot=[params.Boot(dev="hd")])
    vm_params = params.VM(name=vm_name, memory=vm_memory, cluster=vm_cluster,
                          template=vm_template, os=vm_os)
    try:
        api.vms.add(vm=vm_params)                   # try to add the machine
        print("Virtual machine '%s' added." % vm_name)
    except Exception as ex:
        print("Adding virtual machine '%s' failed: %s" % (vm_name, ex))
    api.disconnect()
except Exception as ex:
    print(f"Exception occurred: {ex}")
Output (if the machine is added successfully):
Virtual machine 'vm_demo' added.
Python Network Forensics
Investigating modern network environments comes with many difficulties, whether the goal is to respond to a breach report, carry out vulnerability assessments, or validate regulatory compliance. Let's see some terminology related to network forensics.
1. Client – The part of the client-server architecture that runs on personal computers and workstations.
2. Server – The part of the client-server architecture that executes the client's requests.
3. Protocols – The sets of rules that must be followed when data is transferred between the client and the server.
4. WebSockets – A protocol that provides rules for communication over a TCP connection. It allows messages to flow in both directions.
With the help of these protocols, we can authenticate third-party users and send information to or receive it from them. For a secure channel, however, the traffic needs to be encrypted.
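The client and server roles described above can be sketched in a single script using only the standard library. This is an illustrative toy under stated assumptions, not a forensic tool: the helper serve_once, the loopback address, and the upper-casing "protocol" are all made up for the example, and the traffic is not encrypted.

```python
import socket
import threading

def serve_once(server_sock: socket.socket) -> None:
    """Accept one client, send its message back in upper case, then close."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

# Server side: bind to the loopback interface; port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client side: connect, send a request, and read the reply.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello server")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply.decode())  # HELLO SERVER
```

Running the server in a background thread keeps the example self-contained. In practice the client and the server would be separate programs, usually on separate machines.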
Example of network forensics in Python:
import socket

# Create a TCP/IPv4 socket object
sock_obj = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Get the local host name
host = socket.gethostname()
port = 8080
print("Waiting for the connection")
# Connect to the host on the specified port
sock_obj.connect((host, port))
# Receive at most 1024 bytes from the server
temp = sock_obj.recv(1024)
print("Received:", temp.decode())
sock_obj.close()
Scapy and Dshell in Python
1. Python Dshell:
This is a Python toolkit for network forensic analysis. It was developed by the US Army Research Laboratory and released as open source in 2014. It makes forensic investigation easier.
Dshell has the following decoders:
a. dns: It extracts DNS-related queries
b. reservedips: It identifies DNS resolutions that fall into reserved IP space
c. large-flows: It lists net flows
d. rip-http: It extracts files from HTTP traffic
e. protocols: It helps in identifying non-standard protocols
2. Python Scapy:
Scapy is also a Python-based tool for analyzing and manipulating network traffic. With it we can craft and modify packets, capture packets of a wide range of protocols, and decode them. It gives a detailed view of the network traffic.
Example of scapy in Python:
# Import the IP layer from scapy and the GeoIP toolkit
import GeoIP
from scapy.all import IP

geoIp = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)

def locatePackage(package):
    # Extract the IP addresses of the source and the destination
    src_add = package.getlayer(IP).src
    dest_add = package.getlayer(IP).dst
    # Look up the country code of each address
    src_country = geoIp.country_code_by_addr(src_add)
    dest_country = geoIp.country_code_by_addr(dest_add)
    print("source " + str(src_country) + ' -> ' + "destination " + str(dest_country))
Searching in Python
Searching is one of the most common and important parts of forensics. Keyword searching helps in finding strong evidence, and it takes experience and knowledge to extract information from deleted messages.
Python provides various built-in facilities for search operations. The investigator can frame the search around questions such as "who", "what", "where", "when", and "which".
Example of searching:
str1 = "Searching for the evidence by searching keyword"
str2 = "evidence"
str3 = "Python"
# find() returns the index of the first occurrence, or -1 if the substring is not found
print(str1.find(str2))
print(str1.find(str3))
print(str1.find(str2, 10))  # start the search at index 10
Output:
18
-1
18
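find() reports only the first match and is case-sensitive. When every occurrence of a keyword matters, a regular expression can collect them all. The sketch below is one way to do it; the helper name find_keyword is hypothetical.

```python
import re

def find_keyword(text: str, keyword: str) -> list:
    """Return the start index of every case-insensitive match of `keyword`."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [match.start() for match in pattern.finditer(text)]

text = "Searching for the evidence by searching keyword"
print(find_keyword(text, "searching"))  # [0, 30]
print(find_keyword(text, "Python"))     # []
```

re.escape() ensures that a keyword containing characters such as a dot or a plus sign is matched literally rather than being read as a pattern.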
Indexing in Python
Indexing is a feature that can be used to get potential evidence from files. If the evidence is confined to a memory snapshot, a disk image, a file, or a network trace, indexing reduces the time needed for searching. An index can also be used to locate keywords.
Example of indexing:
list1 = [123, 'Photos', 'forensics', 'details']
print("Index of forensics : ", list1.index('forensics'))  # find the index of 'forensics'
print("Index for 123 : ", list1.index(123))               # find the index of 123
str1 = "Searching for the evidence by searching keyword"
str2 = "evidence"
print("Index of the evidence is: ", str1.index(str2))
Output:
Index of forensics : 2
Index for 123 : 0
Index of the evidence is: 18
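To see how an index speeds up keyword lookups, consider a small inverted index that maps each word to the lines where it appears. This is a minimal sketch: the helper build_index and the log lines are hypothetical stand-ins for a file under investigation.

```python
from collections import defaultdict

def build_index(lines: list) -> dict:
    """Map each lower-cased word to the 1-based line numbers where it occurs."""
    index = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        for word in set(line.lower().split()):
            index[word].append(line_no)
    return index

# Hypothetical log lines standing in for a file under investigation
log_lines = [
    "login failed for admin",
    "login succeeded for guest",
    "file deleted by admin",
]
index = build_index(log_lines)
print(index["admin"])  # [1, 3]
print(index["login"])  # [1, 2]
```

The index is built once by reading every line. After that, each keyword lookup is a dictionary access and does not rescan the data.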
Image Library in Python
We deal with different forms of data, ranging from simple data structures to complex images. Python can handle the information stored in images through the PIL library (maintained today as Pillow). It supports many file formats and graphics operations, and it includes powerful image processing tools.
Extracting information from the images involves the following steps:
1. Importing the images
2. Extracting data using PIL
3. Showing the data in array form
4. Arrange this data for the analysis
Example of image forensics:
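from PIL import Image

# Open the image in read mode (the raw string keeps the backslash in the Windows path)
image = Image.open(r'D:\coins.jpg', 'r')
# Read every pixel; for an RGB image each pixel is a tuple of channel values
pixel_val = list(image.getdata())
# Flatten the tuples into a single list of channel values
flat_pixels = [x for sets in pixel_val for x in sets]
print(max(flat_pixels))  # maximum channel value found in the image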
Mobile Forensics in Python
Non-standard sources of information can also serve as evidence. Nowadays smartphones are widely examined in investigations, and their phone calls, photos, and messages are treated as evidence.
Android smartphones use either a PIN or an alphanumeric password, between 4 and 16 digits or characters long. We will see an example of how the lock-screen password is hashed, which is the starting point for getting past the lock screen to extract data. The password is stored in the file password.key in /data/system as the concatenation of its SHA-1 and MD5 hash sums.
Example of mobile forensics:
public byte[] passwordToHash(String password) {
    if (password == null) {
        return null;
    }
    String algo = null;
    byte[] hashed_pass = null;
    try {
        // Append the salt, then compute both digests over the salted password
        byte[] salted_password = (password + getSalt()).getBytes();
        byte[] sha1 = MessageDigest.getInstance(algo = "SHA-1").digest(salted_password);
        byte[] md5 = MessageDigest.getInstance(algo = "MD5").digest(salted_password);
        // Store the two hexadecimal strings back to back
        hashed_pass = (toHex(sha1) + toHex(md5)).getBytes();
    } catch (NoSuchAlgorithmException e) {
        Log.w(TAG, "Failed to encode string because of missing algorithm: " + algo);
    }
    return hashed_pass;
}
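The same hashing scheme can be reproduced in Python to test candidate PINs against a stored hash. The sketch below is an approximation under stated assumptions: the function names and the salt value are hypothetical, the salt is passed in as a ready-made string, and the comparison ignores letter case because the hex casing produced by toHex() is not shown above.

```python
import hashlib

def password_to_hash(password: str, salt: str) -> str:
    """Concatenate the SHA-1 and MD5 hex digests of password + salt."""
    salted = (password + salt).encode("utf-8")
    sha1_hex = hashlib.sha1(salted).hexdigest()
    md5_hex = hashlib.md5(salted).hexdigest()
    return sha1_hex + md5_hex

def matches(candidate: str, salt: str, stored_hash: str) -> bool:
    """Check whether a candidate PIN or password produces the stored hash."""
    return password_to_hash(candidate, salt) == stored_hash.lower()

salt = "1a2b3c4d"                         # hypothetical salt value
stored = password_to_hash("1234", salt)   # stands in for the contents of password.key
print(len(stored))                        # 72: 40 hex characters of SHA-1 plus 32 of MD5
print(matches("1234", salt, stored), matches("0000", salt, stored))  # True False
```

Because a four-digit PIN has only 10,000 possible values, calling matches() in a loop over all of them finishes almost instantly once the salt is known.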
Conclusion
We have now covered the basics of forensics in Python. We hope you learned some new concepts from this article. Happy learning!