Python Forensics

As the name suggests, forensics is the process of collecting, analyzing, and investigating data so that it can be used as evidence. Wondering how Python helps in forensics?

Python helps in doing these operations on digital data. We will cover different operations and corresponding modules provided by Python for these purposes. Let us start with the introduction to forensics.

What is Forensics in Python?

Nowadays, most of us spend a lot of time on the internet, and cyber-attacks are something we all try to protect ourselves from. As the number of cyber criminals grows, so does the need to investigate the damage they cause and to gather evidence against them.

Cyber forensics deals with these problems, and Python provides different modules for different applications of forensics. It helps in gathering information, including digital data, evidence, and passwords.

Computational forensics is the part of cyber forensics that deals with digital evidence. It helps with pattern-based evidence such as fingerprints, tool marks, and documents. Other patterns include physiological and behavioral patterns, DNA, and digital evidence at crime scenes. In addition, it draws on algorithms from data mining, computer graphics, machine learning, computer vision, data visualization, and statistical pattern recognition.

Before going into the coding part, let us see some of the naming conventions in Python forensics.

Naming Conventions in Python Forensics

Naming | Convention | Example
Local variable | Should be in camelCase; the underscore is optional | rollNumber
Constant | Should be in uppercase, with the words separated by underscores | ROLL_NUMBER
Global variable | Should have the prefix gl_ and be in camelCase; the underscore is optional | gl_rollNumber
Function | Should be in PascalCase and active voice; the underscore is optional | MyRollNumber
Module | Should start with _ and be in camelCase | _rollNumber
Class | Should have the prefix class_ and be in PascalCase; better to keep it short | class_RollNumber
Object | Should have the prefix ob_ and be in camelCase | ob_rollNumber
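
To make the conventions concrete, here is a minimal sketch that applies them in one small script. Every name in it (MAX_ROLL_NUMBER, gl_lastRollNumber, class_Student, AssignRollNumber, ob_student) is hypothetical and chosen only to match the example column above. Note that these conventions belong to this tutorial and differ from the general Python style guide, PEP 8, which prefers snake_case for functions and variables.

```python
# Hypothetical names that follow the conventions listed above.
MAX_ROLL_NUMBER = 500            # constant: uppercase, underscore-separated
gl_lastRollNumber = 0            # global variable: gl_ prefix, camelCase


class class_Student:             # class: class_ prefix, PascalCase
    def __init__(self, rollNumber):
        self.rollNumber = rollNumber


def AssignRollNumber(studentName):   # function: PascalCase, active voice
    global gl_lastRollNumber
    nextRollNumber = gl_lastRollNumber + 1   # local variable: camelCase
    if nextRollNumber > MAX_ROLL_NUMBER:
        raise ValueError("no roll numbers left for " + studentName)
    gl_lastRollNumber = nextRollNumber
    return class_Student(nextRollNumber)


ob_student = AssignRollNumber("Asha")    # object: ob_ prefix, camelCase
print(ob_student.rollNumber)             # 1
```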

Hash Function in Python

A hash function is used in cryptography. It takes input data of any size and generates a fixed-length value called a hash or digest. The same input always gives the same hash, and it is impractical to find two inputs with the same hash. Hashing is one-way, so once the data is mapped to a hash, it is not possible to revert it back. This is useful when we need to store passwords safely or to check that files or other digital data have not changed. Let us see an example.

Example of Python hash function:

import hashlib
import uuid

def hash_func(password):
    # generating a random 128-bit id as 32 hexadecimal characters to use as the salt
    salt = uuid.uuid4().hex
    # encoding the salt and the password using UTF-8, hashing them together with
    # SHA-256, and appending the salt so that the hash can be checked later
    return hashlib.sha256(salt.encode() + password.encode()).hexdigest() + ':' + salt


def check_password(hashed_password, user_password):
    # splitting the stored string into the hash and the salt
    password_hash, salt = hashed_password.split(':')
    # hashing the entered password with the same salt and comparing both hashes;
    # returns either True or False
    return password_hash == hashlib.sha256(salt.encode() + user_password.encode()).hexdigest()

new_pass = input('Enter the new password: ')   # taking the input of the password

hashed_pass = hash_func(new_pass)  # hashing the password
# printing the hashed password
print('The hash string to store in the database is:', hashed_pass)

Output:

Enter the new password: abc@123
The hash string to store in the database is: 937219369c4e241b762f55c13fe7bd0e5c44a343ad84818a123e72762f8d9106:8b1966c93b4b41938933b635b9ac133f

Now let us enter the password again and check whether it is correct using the check_password() function.

Example of checking the correct password:

old_pass=input('Enter the password: ')
print(check_password(hashed_pass,old_pass))

Output:

Enter the password: abc@123

True

Example of checking a wrong password:

old_pass=input('Enter the password: ')
print(check_password(hashed_pass,old_pass))

Output:

Enter the password: abc@23
False

We can see that when we gave the wrong password, we got the output as False. In general, a cryptographic hash function has the following properties:
1. We can easily compute the hash value for an input
2. It is impractical to generate the original value from a given hash value
3. It is impractical to find two different values with the same hash value
4. It is impractical to modify the original input without changing the hash value
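
In forensics, the fourth property is what lets an investigator show that a piece of evidence has not been altered. The following is a minimal sketch of that idea, assuming the evidence is an ordinary file. The function name hash_file, the chunk size, and the temporary demo file are assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # reading in chunks keeps memory use low for large disk images
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file that stands in for a piece of evidence.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"original evidence")
    evidence_path = tmp.name

hash_before = hash_file(evidence_path)

with open(evidence_path, "ab") as handle:
    handle.write(b"!")              # simulate tampering with one extra byte

hash_after = hash_file(evidence_path)
os.remove(evidence_path)

print(hash_before == hash_after)    # False: the change is detected
```

Recording the hash when the evidence is collected and recomputing it later is enough to detect even a one-byte change.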

Cracking the Encryption in Python

During an investigation, the data we recover is often encrypted, and we must crack it before it can serve as evidence. We will discuss the process in this section. Before going into the code, let us learn more about cryptography.

The original message is in human-readable form and is called plain text, whereas the ciphertext is the encrypted form of the plain text. For example, we can take each letter in the plain text and shift it back by 2 places in the alphabet. This turns each A into a Y, each B into a Z, each C into an A, and so on. There are many such encryption patterns. Cracking them is a kind of pattern analysis, and the pattern evidence handled in forensics is broadly classified into:

1. Tire tracks and marks
2. Impressions
3. Fingerprints

Let us see an example of cracking a shift cipher by trying every possible shift.

Example of decrypting:

def decryption(shift, cipher):
    text = ''
    for each in cipher:
        # mapping the printable ASCII characters (codes 32 to 126) to 0..94,
        # shifting back, and wrapping around within those 95 characters
        x = (ord(each) - 32 - shift) % 95 + 32
        text += chr(x)  # converting the ASCII value to a new character
    print(shift, text)

cipherText = input('Enter the message: ')
# trying every possible shift; one of the printed lines is the plain text
for i in range(1, 95):
    decryption(i, cipherText)

Output:

Enter the message: Jgnnq
1 Ifmmp
2 Hello
3 Gdkkn
...

The output continues up to a shift of 94. Here the line for a shift of 2 is readable, so the message was encrypted by shifting each character forward by 2 places.
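
To try a brute-force decryption like this, we first need a ciphertext. The following is a minimal sketch of the encrypting side, assuming a shift over the 95 printable ASCII characters (codes 32 to 126). The function name shift_text and the constants FIRST and COUNT are hypothetical.

```python
FIRST, COUNT = 32, 95    # printable ASCII runs from code 32 to code 126


def shift_text(text, shift):
    """Shift every printable ASCII character forward by `shift` places."""
    shifted = []
    for ch in text:
        code = ord(ch)
        if FIRST <= code < FIRST + COUNT:
            # wrap around inside the printable range
            code = (code - FIRST + shift) % COUNT + FIRST
        shifted.append(chr(code))
    return "".join(shifted)


cipher_text = shift_text("Hello", 2)
print(cipher_text)                    # Jgnnq
print(shift_text(cipher_text, -2))    # Hello
```

A negative shift undoes a positive one, so the same function works for both encrypting and decrypting.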

Python Virtualization

When we imitate IT systems such as workstations, networks, and storage in software, the process is called virtualization. The virtual hardware is emulated by a hypervisor.

Virtualization offers the following benefits for forensics:

1. We can use a workstation in a validated state for each investigation.
2. We can recover the deleted data by including dd images of a drive as a secondary drive on the virtual machine.
3. The same virtual machine can be used as recovery software to gather evidence.

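Now let us see the steps to create a virtual machine in Python. The code below uses the oVirt Python SDK (the ovirtsdk package) and needs access to a running oVirt engine.

1. Let the virtual machine name be vm_demo, and let us give the machine 512 MB of memory.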

vm_demo_memory = 512 * 1024 * 1024

2. Now, we need to attach the machine to a cluster.

vm_demo_cluster=api.clusters.get(name = "Default")  

3. And now, we boot the machine from the virtual HDD

vm_os = params.OperatingSystem(boot = [params.Boot(dev = "hd")])  

Now we will combine the above steps to form a virtual machine:

Example of virtualization in Python:

from ovirtsdk.api import API
from ovirtsdk.xml import params

api = None
try:
    # connecting to the oVirt engine with the API credentials
    api = API(url = "https://HOST",
      username = "ABC",
      password = "abc@123",
      ca_file = "ca.crt")

    vm_name = "vm_demo"  # name of the machine
    vm_memory = 512 * 1024 * 1024  # 512 MB of memory, in bytes
    vm_cluster = api.clusters.get(name = "Default")  # attaching the cluster
    vm_template = api.templates.get(name = "Blank")  # adding the template

    # booting the operating system from the virtual hard disk
    vm_os = params.OperatingSystem(boot = [params.Boot(dev = "hd")])

    vm_params = params.VM(name = vm_name,
      memory = vm_memory,
      cluster = vm_cluster,
      template = vm_template,
      os = vm_os)

    try:
        api.vms.add(vm = vm_params)  # trying to add the machine
        print("Virtual machine '%s' added." % vm_name)  # printed if it is successful
    except Exception as ex:
        print("Adding virtual machine '%s' failed: %s" % (vm_name, ex))  # printed on failure
except Exception as ex:
    print(f"Exception occurred: {ex}")  # printed if the connection or setup fails
finally:
    if api is not None:
        api.disconnect()  # closing the connection in every case

Output:

Virtual machine 'vm_demo' added.

Python Network Forensics

Investigating modern network environments comes with many difficulties, whether we are responding to a breach report, carrying out vulnerability assessments, or validating regulatory compliance. Let’s see some terminology related to network forensics.

1. Client – The part of the client-server architecture that runs on personal computers and workstations.

2. Server – The part of the client-server architecture that executes the client’s requests.

3. Protocols – The sets of rules that must be followed during data transfer between the client and the server.

4. WebSockets – WebSocket is a protocol that provides two-way communication over a single TCP connection. It allows bidirectional messages.

With the help of these protocols, we can authenticate users and send or receive information from third parties. However, the data must be encrypted to get a secure channel.

Example of network forensics in Python:

import socket

# creating a socket object instance
sock_obj = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# getting the local host name
host = socket.gethostname()
port = 8080

print("Waiting for the connection")
# forming a connection to the host on the specified port
# (a server must already be listening there)
sock_obj.connect((host, port))
# receiving data, with an upper limit of 1024 bytes
temp = sock_obj.recv(1024)

sock_obj.close()

Output:

Waiting for the connection
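
The client above only works when a server is already listening on the chosen port. As a minimal sketch of both sides of the client-server exchange, the following script starts a small server in a background thread and connects to it from the same process. The loopback address, the greeting text, and the helper name serve_once are assumptions made for the demo.

```python
import socket
import threading

# Server side: bind to the loopback address and let the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]


def serve_once():
    """Accept one client, send it a short greeting, then close."""
    conn, _addr = server.accept()
    with conn:
        conn.sendall(b"hello from server")


worker = threading.Thread(target=serve_once)
worker.start()

# Client side: connect and read until the server closes the connection.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
chunks = []
while True:
    data = client.recv(1024)    # at most 1024 bytes per call
    if not data:
        break
    chunks.append(data)
received = b"".join(chunks)
client.close()

worker.join()
server.close()

print(received.decode())        # hello from server
```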

Scapy and Dshell in Python

1. Python Dshell:

This is a network forensic analysis toolkit written in Python. It was developed by the US Army Research Laboratory and released as open source in 2014. It makes forensic investigation easier.

Dshell has the following decoders:

a. dns: It extracts the DNS-related queries
b. reservedips: It identifies DNS resolutions that fall into reserved IP address space
c. large-flows: It lists the large netflows
d. rip-http: It extracts the files from HTTP traffic
e. protocols: It helps in identifying non-standard protocols

2. Python Scapy:

It is also a Python-based tool that analyzes and manipulates network traffic. With Scapy, we can craft and manipulate packets, capture the packets of a wide range of protocols, and decode them to get a detailed view of the network traffic.

Example of scapy in Python:

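#Importing scapy and GeoIP toolkit
import GeoIP
from scapy.all import IP, sniff

geoIp = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)

def locatePackage(package):
    # skipping packets that have no IP layer
    if not package.haslayer(IP):
        return

    # extracts the IP address of source
    src_add = package.getlayer(IP).src
    # extracts the IP address of destination
    dest_add = package.getlayer(IP).dst

    # getting the country code of source
    src_country = geoIp.country_code_by_addr(src_add)
    # getting the country code of destination
    dest_country = geoIp.country_code_by_addr(dest_add)

    # the lookup may return None for addresses it cannot place
    print("source", src_country, "->", "destination", dest_country)

# capturing one IP packet and passing it to locatePackage (needs capture privileges)
sniff(filter = "ip", prn = locatePackage, count = 1)

Output (for a packet sent from India to the USA):

source IN -> destination US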
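
To see where the source and destination addresses come from, it helps to read them straight out of the raw bytes of a packet. The following is a minimal sketch using only the standard library, assuming a 20-byte IPv4 header without options. The function name parse_ipv4_header and the hand-built sample header are hypothetical, and the addresses are taken from the ranges reserved for documentation.

```python
import socket
import struct


def parse_ipv4_header(raw):
    """Return (source, destination, protocol number) from a raw IPv4 header."""
    # version/IHL, TOS, total length, id, flags/fragment, TTL, protocol,
    # checksum, source address, destination address
    fields = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    version = fields[0] >> 4
    if version != 4:
        raise ValueError("not an IPv4 header")
    protocol = fields[6]
    src_add = socket.inet_ntoa(fields[8])
    dest_add = socket.inet_ntoa(fields[9])
    return src_add, dest_add, protocol


# A hand-built header for the demo (checksum left at 0, protocol 6 is TCP).
sample_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 20, 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.10"),
    socket.inet_aton("198.51.100.7"),
)

print(parse_ipv4_header(sample_header))   # ('192.0.2.10', '198.51.100.7', 6)
```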

Searching in Python

Searching is one of the most common and important parts of forensics. Keyword searching helps in finding strong evidence. One needs experience and knowledge to get information out of deleted messages.

Python provides various built-in methods and modules for the search operation. The investigator can narrow down the results using keywords such as “who”, “what”, “where”, “when”, “which”, etc.

Example of searching:

str1 = "Searching for the evidence by searching keyword"  
str2 = "evidence" 
str3="Python"
  
#searches for str2 in str1. If found, gives the index of the first occurrence, else gives -1
print(str1.find(str2))
print(str1.find(str3))
print(str1.find(str2, 10))  #starting the search from index 10

Output:

18
-1
18
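
The find() method is case-sensitive and returns only the first match. When every occurrence of a keyword matters, the re module can be used instead. The following is a minimal sketch that reuses the sentence from the example above; the variable names are hypothetical.

```python
import re

evidence_text = "Searching for the evidence by searching keyword"

# re.IGNORECASE makes "Searching" and "searching" both count as matches.
pattern = re.compile(r"searching", re.IGNORECASE)
positions = [match.start() for match in pattern.finditer(evidence_text)]

print(positions)    # [0, 30]
```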

Indexing in Python

Indexing lets the investigator gather potential evidence from a file, a disk image, a memory snapshot, or a network trace. Building an index first reduces the time needed for later searches, and the index can also be used to find keywords.

Example of indexing:

list1 = [123, 'Photos', 'forensics', 'details']  
  
print("Index of forensics : ", list1.index('forensics'))  #finding the index of forensics
print("Index for 123 : ", list1.index(123))  #finding the index of 123
  
str1 = "Searching for the evidence by searching keyword"  
str2 = "evidence"  
  
print("Index of the evidence is: ",str1.index(str2))

Output:

Index of forensics : 2
Index for 123 : 0
Index of the evidence is: 18
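
The time saving comes from building the index once and reusing it for every later lookup. The following is a minimal sketch of such a keyword index, assuming the evidence has already been extracted as lines of text. The sample lines and the name keyword_index are hypothetical.

```python
from collections import defaultdict

# Hypothetical lines standing in for text extracted from a piece of evidence.
log_lines = [
    "user admin logged in",
    "password changed for admin",
    "user guest logged out",
]

# Build the index once: each lowercase word maps to the lines that contain it.
keyword_index = defaultdict(list)
for line_number, line in enumerate(log_lines):
    for word in set(line.lower().split()):
        keyword_index[word].append(line_number)

# Later lookups no longer need to scan the text again.
print(keyword_index["admin"])    # [0, 1]
print(keyword_index["guest"])    # [2]
```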

Image Library in Python

We deal with different forms of data, ranging from simple data structures to complex images. Python provides a library named PIL, now maintained as Pillow, for handling the information stored in images. It supports many file formats and graphics operations and includes powerful image processing tools.

Extracting information from the images involves the following steps:
1. Importing the images
2. Extracting data using PIL
3. Showing the data in array form
4. Arrange this data for the analysis

Example of image forensics:

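from PIL import Image

# a raw string keeps the backslash in the Windows path from being read as an escape
image = Image.open(r'D:\coins.jpg', 'r')
# converting to RGB so that every pixel is a tuple of three values
pixel_val = list(image.convert('RGB').getdata())
flat_pixels = [x for sets in pixel_val for x in sets]
print(max(flat_pixels))  # finding the maximum channel value in the image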

Output:

255

Mobile Forensics in Python

Evidence can also come in non-standard forms. Nowadays, smartphones are widely examined in investigations, and the phone calls, photos, and messages stored on them are considered evidence.

Android smartphones are locked with either a PIN or an alphanumeric password, 4 to 16 digits or characters long. We will see how the password is stored, which is the starting point for getting through the lock screen to extract data. On older Android versions, the password is stored in the file password.key in /data/system as the SHA-1 hash sum followed by the MD5 hash sum of the salted password.

Example of mobile forensics (the Java routine that computes this hash):

public byte[] passwordToHash(String password) {  
  if (password == null) {  
     return null;  
  }  
  String algo = null;  
  byte[] hashed_pass = null;  
  try {  
     byte[] salted_password = (password + getSalt()).getBytes();  
     byte[] sha1 = MessageDigest.getInstance(algo = "SHA-1").digest(salted_password);  
     byte[] md5 = MessageDigest.getInstance(algo = "MD5").digest(salted_password);  
     hashed_pass = (toHex(sha1) + toHex(md5)).getBytes();  
  } catch (NoSuchAlgorithmException e) {  
     Log.w(TAG, "Failed to encode string because of missing algorithm: " + algo);  
  }  
  return hashed_pass;  
}  
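
Once the stored hash and the salt have been recovered from the device, a short PIN can be found by trying every candidate. The following is a minimal Python sketch of the same hashing scheme together with a brute-force search over 4-digit PINs. The function names, the demo salt, and the demo PIN are hypothetical, and the exact form of the salt string and the letter case of the hex digits are assumptions, since getSalt() and toHex() are not shown above.

```python
import hashlib


def password_to_hash(password, salt):
    """Mirror the Java routine: hex SHA-1 followed by hex MD5 of password + salt."""
    salted_password = (password + salt).encode()
    sha1_hex = hashlib.sha1(salted_password).hexdigest()
    md5_hex = hashlib.md5(salted_password).hexdigest()
    return sha1_hex + md5_hex


def crack_pin(stored_hash, salt, digits=4):
    """Try every PIN with the given number of digits; return the match or None."""
    target = stored_hash.lower()    # compare in lowercase (assumed hex case)
    for number in range(10 ** digits):
        candidate = str(number).zfill(digits)
        if password_to_hash(candidate, salt) == target:
            return candidate
    return None


# Hypothetical salt and PIN, used only to exercise the two functions.
demo_salt = "1a2b3c4d"
demo_hash = password_to_hash("0427", demo_salt)
print(crack_pin(demo_hash, demo_salt))    # 0427
```

A 4-digit PIN has only 10,000 candidates, so the search finishes almost instantly. Longer alphanumeric passwords take far more attempts.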

Conclusion

Finally, we are done with forensics in Python. We hope you learned some new concepts from this article. Happy learning!
