Network Traffic Forensics for Malware Detection using PCAP Analysis

Introduction :

This data analysis focuses on identifying malware presence using passive network traffic analysis on the PCAP dataset dated 2020-04-24. The file was analyzed using Wireshark without executing any malicious code. A layered approach was applied by examining DNS, HTTP, TCP, and IP-level behavior to detect suspicious communication patterns.

The analysis emphasizes indicators such as command-and-control (C2) communication, periodic beaconing, abnormal traffic bursts, and suspicious data transfers. Multiple independent observations were correlated to confirm the presence of malware activity within the network.

Objectives :

1. To identify the infected host based on abnormal network behavior
2. To detect malware communication patterns such as command-and-control (C2) and beaconing
3. To analyze DNS, HTTP, and TCP traffic for malicious indicators
4. To observe abnormal traffic patterns such as bursts, short-lived sessions, and asymmetric data flow
5. To confirm malware presence using multiple correlated pieces of evidence

PCAP DESCRIPTION :

The PCAP file contains captured network traffic from an infected Windows host communicating with external infrastructure. The dataset includes DNS queries, HTTP requests and responses, TCP sessions, and encrypted traffic over TLS.

The traffic reveals repeated connections to specific external IPs/domains, periodic communication intervals, and abnormal session behavior. These patterns indicate automated communication typically associated with malware, including possible payload delivery and command-and-control interaction.

https://www.malware-traffic-analysis.net/2020/04/24/index.html


Architecture of Work :

[Figure: simplified visual view of the work architecture]






Procedure of work :

The given PCAP file (2020-04-24 Malware Traffic Dataset) was analyzed using Wireshark to detect suspicious network activity. Initially, the overall traffic was inspected to understand communication patterns and identify unusual behavior. Using Statistics → Conversations → IPv4, the infected host 10.4.24.104 was identified based on its frequent and abnormal communication with multiple external IP addresses.

Protocol-based filtering was then applied using dns, http, and tcp filters to analyze specific traffic types. DNS queries were examined for repetitive or unusual patterns, indicating automated behavior, while HTTP traffic was inspected using TCP stream analysis to detect possible data transfers from external servers. TCP communication further revealed continuous data exchange, suggesting command-and-control (C2) activity or payload delivery.

Finally, statistical analysis was performed using Python to compute packet size, packet rate, and throughput, and the results were visualized using graphs. These graphs showed bursty traffic patterns, high data transfer spikes, and irregular packet behavior. By correlating these indicators, the presence of malware activity in the network was confirmed.


Inference: Indicators of Malware Presence


1. High Outbound Traffic Pattern

🔸 Why it matters
Infected hosts usually initiate frequent outbound connections to communicate with attacker-controlled servers.

🔸 How it was analyzed in Wireshark
Statistics → Conversations → IPv4
Sorted by Packets and Bytes

🔸 Observation
The internal IP 10.4.24.104 shows a significantly higher number of packets exchanged with multiple external IPs such as 54.36.108.120.

🔸 Evidence 



🔸 Conclusion
This abnormal outbound communication is suspicious and points to a possible malware infection.
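
The conversation table can be roughly cross-checked with a short script. The sketch below assumes the packets have already been reduced to (source IP, destination IP, length) tuples, for example with Scapy. The function name and the sample values are illustrative and are not taken from the capture.

```python
from collections import defaultdict

def conversation_totals(packets):
    """Aggregate packet and byte counts per unordered IP pair.

    `packets` is an iterable of (src_ip, dst_ip, length) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # pair -> [packet_count, byte_count]
    for src, dst, length in packets:
        pair = tuple(sorted((src, dst)))  # A->B and B->A are one conversation
        totals[pair][0] += 1
        totals[pair][1] += length
    # Busiest conversations first, like sorting the Wireshark table by Packets
    return sorted(totals.items(), key=lambda kv: kv[1][0], reverse=True)

# Illustrative input only (lengths and the 192.0.2.10 peer are made up)
sample = [
    ("10.4.24.104", "54.36.108.120", 60),
    ("54.36.108.120", "10.4.24.104", 1500),
    ("10.4.24.104", "54.36.108.120", 60),
    ("10.4.24.104", "192.0.2.10", 80),
]
ranked = conversation_totals(sample)
```

The first entry of `ranked` is the busiest host pair, which should match the top row of the Wireshark table when sorted by Packets.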


2. Internal vs External IP Analysis

🔸 Why it matters
Malware typically resides inside a private network and initiates communication with multiple external servers to receive commands or exfiltrate data.

🔸 How it was analyzed in Wireshark
Statistics → Conversations → IPv4
Identification of private IP range (10.x.x.x) vs public IP addresses

🔸 Observation
From the conversation table, the internal IP 10.0.0.149 is observed communicating with a large number of external/public IP addresses such as:

  • 8.28.7.83
  • 23.51.133.119
  • 31.13.66.19
  • 34.98.72.95
  • 35.190.29.196

The number of packets and bytes exchanged with these external IPs is significantly higher compared to other internal communications.

🔸 Evidence (Screenshot)



🔸 Conclusion
The internal host 10.0.0.149 is actively initiating communication with multiple external servers, which is abnormal behavior. This strongly indicates that the system is likely infected and communicating with external command-and-control (C2) infrastructure.
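
The private versus public split can also be done programmatically. This is a minimal sketch using Python's standard `ipaddress` module; the helper name `external_peers` and the sample pairs are assumptions made for illustration.

```python
import ipaddress
from collections import defaultdict

def external_peers(pairs):
    """Map each private source address to the set of public peers it contacted.

    `pairs` is an iterable of (src_ip, dst_ip) strings. `is_private` covers the
    10.x.x.x range used here as well as other non-global ranges.
    """
    peers = defaultdict(set)
    for src, dst in pairs:
        src_private = ipaddress.ip_address(src).is_private
        dst_private = ipaddress.ip_address(dst).is_private
        if src_private and not dst_private:
            peers[src].add(dst)
    return dict(peers)

# Illustrative pairs; 10.0.0.1 is a hypothetical internal peer
sample = [
    ("10.0.0.149", "8.28.7.83"),
    ("10.0.0.149", "23.51.133.119"),
    ("10.0.0.149", "10.0.0.1"),    # internal peer, ignored
    ("10.0.0.149", "8.28.7.83"),   # duplicate, counted once
]
result = external_peers(sample)
```

A host whose set of external peers is much larger than that of its neighbours is a candidate for closer inspection.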


3. Bursty and Periodic HTTP Traffic Pattern (Beaconing Behavior)

🔸 Why it matters
Malware often communicates in two ways:

  • Burst traffic → when sending/receiving data
  • Low periodic traffic → small “check-in” signals (beacons)

This combination is important because normal user activity is irregular, whereas malware communication is automated and follows a pattern.

🔸 How it was analyzed in Wireshark
Statistics → I/O Graphs

Filter applied: http.request
Interval set to: 1 second

🔸 Observation
From the graph, the HTTP traffic shows a distinct bursty pattern:

  • Initial phase shows high spikes (up to ~40 packets/sec)
  • Followed by long idle periods with almost no traffic
  • Later, small repeated spikes appear at intervals

This indicates:

  • Data transfer happens in bursts
  • Followed by quiet waiting periods
  • Then small periodic communication resumes

🔸 Evidence (Screenshot)



🔸 Conclusion
This irregular yet repeating pattern strongly suggests malware beaconing behavior. The infected host is likely:

  • Sending bulk data in bursts (possible payload transfer)
  • Then switching to low periodic communication (C2 check-ins)

Such behavior is a clear indicator of automated malware communication, not human browsing activity.
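
One way to put a number on "periodic" is to measure how regular the gaps between requests are. The sketch below is an assumption-based illustration, not the method used in this analysis: it computes the coefficient of variation of inter-request gaps, where a value near 0 means evenly spaced events. The timestamps are hypothetical.

```python
from statistics import mean, pstdev

def interval_regularity(timestamps):
    """Return (mean_gap, coefficient_of_variation) for a list of event times.

    A coefficient of variation near 0 means evenly spaced events, which is
    what fixed-interval beaconing looks like. Larger values mean irregular gaps.
    """
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    avg = mean(gaps)
    return avg, (pstdev(gaps) / avg if avg else float("inf"))

beacon_like = [0.0, 60.0, 120.0, 180.0, 240.0]  # hypothetical check-ins
irregular = [0.0, 1.0, 2.0, 100.0, 101.0]       # hypothetical browsing

beacon_stats = interval_regularity(beacon_like)
irregular_stats = interval_regularity(irregular)
```

Real beacons often add jitter, so a small non-zero value is still worth a look.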


4. Excessive DNS Requests from a Single Host

🔸 Why it matters
Malware often uses DNS queries to continuously discover command-and-control (C2) servers or resolve domains dynamically. Unlike normal users, malware generates DNS traffic automatically and at high frequency, making it a strong indicator of compromise.

🔸 How it was analyzed in Wireshark
DNS traffic was filtered using:

dns

The packet list was examined to identify:

  • Source IP generating requests
  • Frequency of DNS queries
  • Variety of domains being accessed

🔸 Observation
From the DNS traffic, the internal host 10.0.0.202 is repeatedly generating DNS queries within a very short time span.

The host is resolving multiple domains such as:

  • google.com
  • yahoo.com
  • bing.com
  • amazon.com
  • cloudflare-related domains

Additionally, queries and responses appear continuously without normal browsing gaps, indicating automated behavior rather than human activity.

🔸 Evidence (Screenshot)



🔸 Conclusion
The high volume and continuous DNS queries originating from a single host (10.0.0.202) strongly indicate automated domain resolution activity. This pattern is commonly associated with malware performing:

  • Command-and-control (C2) communication
  • Domain discovery
  • Background beaconing

This behavior confirms suspicious activity and supports the presence of malware.
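
The per-host DNS volume can be tallied with a few lines of Python. The sketch assumes DNS queries have been exported as (source IP, query name) pairs; the function name and sample data are illustrative.

```python
from collections import Counter, defaultdict

def dns_query_summary(queries):
    """Summarise DNS queries per source host.

    `queries` is an iterable of (src_ip, qname) tuples, one per query.
    Returns {src_ip: (total_queries, distinct_names)}.
    """
    totals = Counter()
    names = defaultdict(set)
    for src, qname in queries:
        totals[src] += 1
        # DNS names are case-insensitive and may carry a trailing dot
        names[src].add(qname.lower().rstrip("."))
    return {src: (totals[src], len(names[src])) for src in totals}

# Illustrative input only
sample = [
    ("10.0.0.202", "google.com"),
    ("10.0.0.202", "Google.com."),
    ("10.0.0.202", "yahoo.com"),
    ("10.0.0.149", "bing.com"),
]
summary = dns_query_summary(sample)
```

Comparing total queries with distinct names separates a host that repeats the same lookups from one that resolves many different domains.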

5. Suspicious Repeated HTTP Requests to External Server

🔸 Why it matters
Malware often communicates with external servers using HTTP requests. Unlike normal browsing, malware generates repeated and automated requests to the same server to download payloads, fetch updates, or maintain communication with a command-and-control (C2) server.

🔸 How it was analyzed in Wireshark
HTTP traffic was filtered using:

http.request

The packet list was analyzed to observe:

  • Source IP generating requests
  • Destination server
  • Type and frequency of HTTP requests

🔸 Observation
From the filtered traffic, the internal host 10.0.0.202 is sending multiple HTTP GET requests to the external IP 104.26.3.17.

Key findings:

  • A large number of GET requests are sent continuously
  • Requests include paths like:
    • /wp-content/...
    • /wp-includes/...
    • /images/...
  • The requests occur rapidly and in sequence, indicating automated behavior

This pattern is not typical human browsing, as it lacks pauses and shows structured repeated access.

🔸 Evidence (Screenshot)



🔸 Conclusion
The repeated and rapid HTTP requests from 10.0.0.202 to a single external server strongly indicate automated communication behavior. This is commonly associated with malware performing:

  • Payload retrieval
  • Command-and-control communication
  • Background data exchange

This supports the presence of suspicious or malicious activity.

6. Suspicious File Transfer and Automated HTTP Resource Activity

🔸 Why it matters

Attackers commonly use HTTP (Port 80) to transfer malicious content because it blends with normal web traffic. Instead of directly downloading executable files, modern malware often downloads multiple scripts (JavaScript), styles, or images that can later be used to execute hidden payloads or communicate with external servers.

🔸 How it was analyzed in Wireshark

HTTP objects were extracted using:

File → Export Objects → HTTP

This allows inspection of all files transferred over HTTP including scripts, images, and other web resources.

🔸 Observation

A large number of HTTP objects were observed being downloaded from external domains such as www.ubuntugeek.com, along with requests to third-party services like pagead2.googlesyndication.com and images.intellitxt.com.

The traffic included multiple JavaScript (.js) files, CSS files, and images, indicating repeated automated resource fetching. The high frequency and variety of downloaded files suggest non-human browsing behavior.

🔸 Evidence (Screenshot)





🔸 Conclusion

The presence of continuous HTTP object downloads, especially multiple JavaScript files from external domains, indicates automated behavior rather than normal user browsing. This pattern is commonly associated with malware activity where scripts are used to fetch additional payloads or establish communication with remote servers.  



7. Anomalous LLMNR Traffic (Poisoning Risks)

🔸 Why it matters
LLMNR (Link-Local Multicast Name Resolution) is used by Windows systems as a fallback mechanism when DNS resolution fails. However, this protocol lacks authentication, making it vulnerable to spoofing attacks. Attackers can exploit this weakness by sending fake responses to capture sensitive information such as user credentials.

🔸 How it was analyzed in Wireshark
LLMNR activity was analyzed using:

Analyze → Expert Information

The focus was on identifying unresolved queries and abnormal LLMNR behavior.

🔸 Observation
A noticeable number of LLMNR queries were observed without receiving valid responses.

Key findings:

  • Multiple unresolved LLMNR requests were detected
  • Frequent "response missing" alerts appeared in the analysis
  • Repeated multicast queries were sent on the local link
  • No proper hostname resolution occurred for several requests

This indicates that systems are actively attempting local name resolution but failing consistently.

🔸 Evidence (Screenshot)

🔸 Conclusion
The high number of unresolved LLMNR requests indicates a vulnerable network condition. This behavior can be exploited by attackers to perform LLMNR poisoning and capture credentials, suggesting a potential security risk within the environment.

🔸 Observation
From the TCP conversation analysis, the internal host 10.0.0.149 is initiating multiple connections to various external IP addresses across ports such as 80 and 443.

Key findings:

  • A large number of TCP sessions are created rapidly
  • Each session contains low packet counts (approximately 10–40 packets)
  • Very minimal data transfer (only a few kilobytes per session)
  • Connections terminate quickly and are repeatedly re-established
  • Multiple different external IPs are contacted within a short time interval

This behavior does not reflect normal user activity, as typical sessions persist longer and transfer more data.

🔸 Evidence 







🔸 Conclusion
The presence of repeated, short-lived TCP sessions with minimal data exchange indicates automated communication behavior. This pattern is commonly associated with malware performing scanning, probing, or repeated connection attempts to external infrastructure, suggesting potential compromise of the host system.
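
Short-lived sessions can be filtered out of an exported conversation list with simple thresholds. In the sketch below the thresholds, field names, and sample sessions are all illustrative assumptions and would need tuning for a real capture.

```python
def short_lived_sessions(sessions, max_packets=40, max_bytes=5000, max_duration=5.0):
    """Return sessions that are small and brief enough to look like check-ins.

    Each session is a dict with 'peer', 'packets', 'bytes', and 'duration'
    (seconds). The default thresholds are assumptions, not measured values.
    """
    return [
        s for s in sessions
        if s["packets"] <= max_packets
        and s["bytes"] <= max_bytes
        and s["duration"] <= max_duration
    ]

# Hypothetical sessions using documentation-range addresses
sample = [
    {"peer": "203.0.113.5", "packets": 12, "bytes": 2400, "duration": 0.8},
    {"peer": "203.0.113.6", "packets": 900, "bytes": 750000, "duration": 42.0},
]
flagged = short_lived_sessions(sample)
```

A high share of flagged sessions from one host is what the key findings above describe.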

8. DNS Anomalies Analysis

🔸 Why it matters

Frequent failed DNS queries are a strong indicator of automated behavior. Malware often uses Domain Generation Algorithms (DGA) to continuously generate and query domain names in an attempt to locate active command-and-control (C2) servers.

🔸 How it was analyzed in Wireshark

Filter applied:
dns && ip.addr == 10.0.0.149

The DNS packet list was examined to identify query patterns, response types, and domain resolution behavior.

🔸 Observation

The internal host 10.0.0.149 was repeatedly querying domains such as:

  • wpad.steelcoffee.net
  • DESKTOP-C10SKPY.steelcoffee.net

Key findings:

  • Multiple DNS queries resulted in “No such name” (NXDOMAIN) responses
  • Repeated querying of the same domains within a short time interval
  • Presence of dynamic DNS update attempts for internal naming
  • Continuous resolution attempts despite failed responses

This pattern indicates automated and persistent domain resolution attempts rather than normal user activity.

🔸 Evidence 


🔸 Conclusion

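The repeated failed DNS queries and continuous lookup attempts show automated, persistent name resolution by the host. Malware that uses domain generation or fallback domains to locate active C2 infrastructure produces a similar pattern of failed lookups. However, wpad.<domain> and hostname queries are also generated by Windows itself (Web Proxy Auto-Discovery and dynamic DNS registration), so this indicator should be weighed together with the others when assessing compromise of the host system.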
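
Two simple measurements help to judge whether failed lookups look algorithmically generated: the share of NXDOMAIN responses and the character entropy of the queried label. The sketch below is an illustration under stated assumptions. The function names are hypothetical, and entropy is only a rough heuristic because DGA names tend to be long and random-looking while names like wpad are short and readable.

```python
import math
from collections import Counter

def label_entropy(qname):
    """Shannon entropy (bits per character) of the leftmost DNS label."""
    label = qname.split(".")[0].lower()
    if not label:
        return 0.0
    counts = Counter(label)
    n = len(label)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def nxdomain_ratio(rcodes):
    """Fraction of DNS responses whose rcode is 3 (NXDOMAIN)."""
    rcodes = list(rcodes)
    if not rcodes:
        return 0.0
    return sum(1 for rcode in rcodes if rcode == 3) / len(rcodes)

wpad_entropy = label_entropy("wpad.steelcoffee.net")
ratio = nxdomain_ratio([3, 3, 0, 3])  # hypothetical response codes
```

Entropy should be compared across many names from the same capture rather than read as an absolute score.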


9. Anomalous Connection Frequency (High Request Rate)

🔸 Why it matters
A very high number of connection requests in a short time is not typical of human activity. Such behavior is usually generated by automated tools and is commonly associated with DoS attacks or aggressive brute-force attempts.

🔸 How it was analyzed in Wireshark
Statistics → I/O Graphs (Packets per Tick)

🔸 Observation
A sharp spike was observed in the traffic graph, indicating that a single host was generating a large number of packets within a very short time interval. The sudden increase in request rate reflects abnormal and automated activity.

🔸 Evidence (Screenshot)



🔸 Conclusion
The unusually high connection frequency indicates automated attack behavior. This pattern is consistent with DoS activity or aggressive scanning/brute-force attempts, suggesting malicious intent from the source host.


10. Throughput Analysis
🔸 Why it matters
Throughput reflects the volume of data transferred over time. Sudden spikes often indicate bulk data transfer, which may correspond to payload delivery or data exfiltration activities.
🔸 How it was analyzed in Scapy
Scapy was used to process the PCAP file, group packets over time intervals, and calculate bytes-per-second to visualize traffic flow.
🔸 Observation
The throughput graph showed multiple sharp spikes instead of a smooth pattern, indicating bursts of high data transfer within short time intervals rather than consistent traffic flow.
🔸 Evidence (Screenshot)



🔸 Conclusion
The presence of irregular spikes in throughput indicates burst-based data transfer behavior. This pattern is commonly associated with malware activity, confirming periods of high data movement and potential malicious operations.
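
The bytes-per-second calculation described above can be sketched as follows. The function works on plain (timestamp, length) pairs. With Scapy installed, such pairs could be built with something like `[(float(p.time), len(p)) for p in rdpcap(path)]`, which is an assumption about how the capture was loaded rather than the exact script used here.

```python
from collections import defaultdict

def bytes_per_interval(records, interval=1.0):
    """Bin (timestamp, length) records into fixed-width intervals.

    Returns one byte total per interval, starting at the first packet.
    Empty intervals are kept as 0 so idle gaps stay visible in a plot.
    """
    records = sorted(records)
    if not records:
        return []
    start = records[0][0]
    bins = defaultdict(int)
    for ts, length in records:
        bins[int((ts - start) // interval)] += length
    last = max(bins)
    return [bins[i] for i in range(last + 1)]

# Illustrative records: two small packets, then a large one three seconds later
sample = [(0.1, 100), (0.5, 200), (3.2, 1500)]
throughput = bytes_per_interval(sample)
```

Plotting the returned list against the interval index gives the throughput graph. Counting records per interval instead of summing lengths gives the packet rate.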


11. Packet Rate Analysis
🔸 Why it matters
Packet rate indicates how frequently packets are transmitted over time. A high packet rate, especially in short bursts, often reflects automated communication such as command-and-control (C2) signaling or scripted activity rather than normal user behavior.
🔸 How it was analyzed in Scapy
Scapy was used to calculate the number of packets per time interval and plot the packet frequency to observe traffic patterns.
🔸 Observation
The packet rate graph showed sudden spikes followed by sharp drops to near zero, creating an irregular on-off pattern. This indicates bursts of activity occurring at specific intervals rather than continuous traffic flow.
🔸 Evidence


🔸 Conclusion
The fluctuating and bursty packet rate indicates automated, machine-driven communication. This pattern is consistent with malware behavior, particularly periodic communication with external servers or execution of scripted tasks.


12.  Packet Size Distribution Analysis
🔸 Why it matters
Packet size distribution helps identify the nature of network communication. Smaller packets are typically associated with control messages or command exchanges, while larger packets (close to MTU size) indicate data transfer such as file downloads or uploads.
🔸 How it was analyzed in Scapy
Scapy was used to extract packet sizes from the PCAP file and plot a histogram to visualize the distribution of packet lengths.
🔸 Observation
The analysis showed a high number of small-sized packets along with a noticeable concentration of large-sized packets near the upper range. This indicates a combination of frequent control communication and intermittent bulk data transfer.
🔸 Evidence 



🔸 Conclusion
The presence of both small and large packet sizes confirms mixed traffic behavior involving command communication and data transfer. This pattern is commonly associated with malware activity, where instructions are exchanged followed by movement of payloads or exfiltrated data.
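
The histogram step can be reproduced without a plotting library by counting lengths per bucket. The bucket edges below are illustrative assumptions chosen to separate small control-sized packets from near-MTU packets, and the sample lengths are made up.

```python
def size_histogram(lengths, edges=(0, 100, 500, 1000, 1400, 1600)):
    """Count packet lengths in [edges[i], edges[i+1]) buckets.

    Lengths outside the overall range are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for n in lengths:
        for i in range(len(edges) - 1):
            if edges[i] <= n < edges[i + 1]:
                counts[i] += 1
                break
    return counts

# Illustrative lengths: three small packets, one medium, two near-MTU frames
sample = [54, 60, 66, 1514, 1514, 590]
histogram = size_histogram(sample)
```

Peaks in the first and last buckets with little in between correspond to the mix of small and near-MTU packets described in the observation.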


13. Traffic Burstiness Analysis (Short Spikes)
🔸 Why it matters
Traffic burstiness reflects the pattern of data transmission over time. Malware often communicates in short, intense bursts to avoid detection, rather than maintaining steady traffic like normal user activity.
🔸 How it was analyzed in Scapy
Throughput and packet rate graphs were compared to identify synchronized spikes in traffic intensity over time.
🔸 Observation
The traffic pattern showed repeated cycles of inactivity followed by sudden spikes of high activity. These bursts occurred for short durations and then dropped back to near-zero levels, indicating irregular and non-continuous communication.
🔸 Evidence




🔸 Conclusion
The presence of repeated high-intensity bursts confirms automated communication behavior. This bursty traffic pattern is a strong indicator of malware-driven activity, where data is transmitted in short intervals to evade detection.
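
Burstiness can be summarised with two numbers computed from a per-interval series such as bytes per second: the peak-to-mean ratio and the fraction of idle intervals. These two metrics are an illustrative choice and are not part of the original analysis. The sample series are hypothetical.

```python
def burstiness(series):
    """Return (peak_to_mean, idle_fraction) for a per-interval traffic series.

    A steady flow gives a ratio close to 1 and few idle intervals. A bursty
    flow gives a high ratio and many intervals with no traffic at all.
    """
    if not series or sum(series) == 0:
        return 0.0, 1.0
    avg = sum(series) / len(series)
    idle = sum(1 for v in series if v == 0) / len(series)
    return max(series) / avg, idle

steady = burstiness([100, 100, 100, 100])  # hypothetical steady flow
bursty = burstiness([0, 0, 800, 0])        # hypothetical single burst
```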

14. Asymmetric Data Flow Analysis

🔸 Why it matters
Client-server traffic is rarely perfectly balanced, but the direction and size of the imbalance are informative. During malware activity such as payload delivery, the server sends a significantly larger volume of data while the client sends little more than requests and acknowledgment packets, creating a clear asymmetry in traffic flow.

🔸 How it was analyzed in Wireshark
Statistics → Conversations → TCP

The analysis focused on comparing:
Bytes A → B and Bytes B → A columns

🔸 Observation
From the TCP conversation table, the internal host 10.0.0.149 was observed communicating with multiple external servers over ports such as 80 and 443.

Key findings:

  • Several sessions show noticeable imbalance between transmitted and received bytes
  • In multiple connections, one side transmits significantly more data than the other
  • The internal host sends relatively smaller data while receiving larger responses
  • This uneven distribution indicates one-direction dominant data transfer rather than balanced communication

🔸 Evidence (Screenshot)



🔸 Conclusion
The observed asymmetric data flow indicates that the internal host is primarily acting as a receiver in several TCP sessions. This pattern is commonly associated with payload delivery or bulk data transfer, suggesting potential malicious activity such as downloading external content or receiving instructions from remote servers.
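
The comparison of the Bytes A → B and Bytes B → A columns can be expressed as a ratio. In the sketch below, the 0.9 threshold, the function names, and the sample conversations are assumptions made for illustration.

```python
def inbound_ratio(bytes_out, bytes_in):
    """Inbound bytes as a fraction of all bytes in a conversation (0.0 to 1.0)."""
    total = bytes_out + bytes_in
    return bytes_in / total if total else 0.0

def download_heavy(conversations, threshold=0.9):
    """Return peers whose conversations are dominated by inbound data.

    `conversations` is an iterable of (peer_ip, bytes_out, bytes_in), seen
    from the internal host's point of view.
    """
    return [
        peer for peer, out_b, in_b in conversations
        if inbound_ratio(out_b, in_b) >= threshold
    ]

# Hypothetical conversations using documentation-range addresses
sample = [
    ("203.0.113.5", 1000, 99000),  # mostly inbound
    ("203.0.113.6", 5000, 5000),   # balanced
]
heavy = download_heavy(sample)
```

Ordinary file downloads are also inbound-heavy, so flagged peers still need to be checked against what was actually transferred.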


15. TCP Retransmissions / Network Anomalies

🔸 Why it matters
TCP retransmissions occur when packets are not successfully delivered and need to be resent. While occasional retransmissions are normal, repeated occurrences can indicate unstable communication, network issues, or suspicious repeated connection attempts often seen in malware activity.


🔸 How it was analyzed in Wireshark


TCP retransmissions were identified using:

tcp.analysis.retransmission

The packet list was examined to observe repeated transmission attempts between the same source and destination.

🔸 Observation
From the filtered traffic, the internal host 10.4.24.104 was observed repeatedly retransmitting packets to the external IP 54.36.108.120.

Key findings:

  • Multiple retransmission packets were observed between the same source and destination
  • Packets had similar sizes and sequence numbers, indicating repeated resend attempts
  • Retransmissions occurred within short time intervals
  • Communication was concentrated between a single internal and external host pair

This behavior suggests repeated attempts to deliver data that was not successfully acknowledged.

🔸 Evidence 


🔸 Conclusion
The presence of repeated TCP retransmissions between the same hosts indicates abnormal communication behavior. This pattern may be associated with unstable connections or malware attempting persistent communication with an external server, suggesting potential malicious activity.
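
The idea behind the retransmission filter can be approximated by looking for data segments that repeat the same flow, sequence number, and payload length. This is a deliberate simplification of Wireshark's tcp.analysis.retransmission heuristic, which also takes timing and acknowledgment state into account. The ports and sequence numbers in the sample are hypothetical.

```python
def find_retransmissions(segments):
    """Return segments that repeat an already-seen (flow, seq, payload_len).

    Each segment is (src, sport, dst, dport, seq, payload_len). Segments
    without payload are skipped because pure ACKs legitimately reuse
    sequence numbers.
    """
    seen = set()
    repeats = []
    for seg in segments:
        if seg[5] == 0:
            continue
        if seg in seen:
            repeats.append(seg)
        else:
            seen.add(seg)
    return repeats

# Hypothetical segments between the host pair named in the observation
sample = [
    ("10.4.24.104", 49200, "54.36.108.120", 443, 1000, 517),
    ("10.4.24.104", 49200, "54.36.108.120", 443, 1000, 517),  # resent
    ("10.4.24.104", 49200, "54.36.108.120", 443, 1517, 0),    # pure ACK
    ("10.4.24.104", 49200, "54.36.108.120", 443, 1517, 120),
]
repeats = find_retransmissions(sample)
```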


16. Port Scanning Behavior

🔸 Why it matters
Port scanning is a reconnaissance technique used by malware to discover open services and vulnerable systems. It involves sending TCP SYN packets to multiple ports or hosts to identify potential entry points.

🔸 How it was analyzed in Wireshark
TCP SYN packets were filtered using:

tcp.flags.syn == 1 && tcp.flags.ack == 0

The packet list was analyzed to observe connection initiation behavior.

🔸 Observation
From the filtered traffic, multiple TCP SYN packets were observed originating from internal hosts such as 10.0.0.202 and 10.0.0.167.

Key findings:

  • Several SYN packets were sent to different external IP addresses such as 35.224.99.156, 104.26.3.17, and 91.189.92.41
  • Connections were initiated on common service ports like 80 (HTTP) and 443 (HTTPS)
  • Some retransmitted SYN packets were also observed, indicating repeated connection attempts
  • The traffic shows multiple connection initiations within a short time interval

However, there is no strong evidence of systematic scanning across a wide range of ports or a large number of hosts.

🔸 Evidence 



🔸 Conclusion
SYN packets indicate active connection attempts by internal hosts. While repeated connection initiation is visible, the absence of large-scale multi-port probing suggests that this behavior is more consistent with automated communication or malware-driven activity rather than aggressive port scanning.
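
Whether SYN activity amounts to scanning can be checked by counting how many distinct hosts and ports each source targets. The sketch below assumes the SYN-only packets have been exported as (source IP, destination IP, destination port) tuples. The pairing of ports to addresses in the sample is illustrative and not read from the capture.

```python
from collections import defaultdict

def syn_fanout(syns):
    """Summarise SYN targets per source.

    `syns` is an iterable of (src_ip, dst_ip, dst_port) for packets with
    SYN set and ACK clear. Returns {src: (distinct_hosts, distinct_ports)}.
    """
    hosts = defaultdict(set)
    ports = defaultdict(set)
    for src, dst, dport in syns:
        hosts[src].add(dst)
        ports[src].add(dport)
    return {src: (len(hosts[src]), len(ports[src])) for src in hosts}

# Illustrative SYNs; the last line repeats an earlier target (retransmitted SYN)
sample = [
    ("10.0.0.202", "35.224.99.156", 443),
    ("10.0.0.202", "104.26.3.17", 80),
    ("10.0.0.202", "91.189.92.41", 80),
    ("10.0.0.202", "104.26.3.17", 80),
]
fanout = syn_fanout(sample)
```

A scanner typically shows dozens or hundreds of distinct ports or hosts per source. A handful of hosts on ports 80 and 443 is consistent with the conclusion that this is not systematic scanning.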


17. Unusual POST Requests


🔸 Why it matters
Malware often uses HTTP POST requests to send collected data (such as system info, credentials, or status updates) to external command-and-control (C2) servers. Unlike normal browsing, these requests are automated and repetitive.

🔸 How it was analyzed in Wireshark
HTTP POST traffic was filtered using:

http.request.method == "POST"

The packet list was examined to identify the source, destination, and frequency of POST requests.

🔸 Observation
From the filtered traffic, multiple HTTP POST requests were observed originating from internal hosts such as 10.0.0.149 and 10.0.0.202.

Key findings:

  • Repeated POST requests were sent to the external IP 52.20.172.27
  • The request path /ptmdP appears multiple times, indicating a consistent and automated communication pattern
  • The content type is text/plain, which is unusual for typical web browsing behavior
  • Requests are generated rapidly and in sequence without user interaction
  • A separate POST request to 104.26.3.17 (WordPress API endpoint) is also observed, but the repeated structured POSTs stand out as suspicious

This pattern strongly suggests non-human, automated activity rather than legitimate user behavior.

🔸 Evidence 



🔸 Conclusion
The repeated and structured HTTP POST requests from internal hosts to external servers indicate automated data transmission. This behavior is commonly associated with malware communicating with a command-and-control server, confirming suspicious or potentially malicious activity within the network.
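
Repeated POSTs to the same endpoint can be surfaced by grouping requests on destination and path. The sketch below assumes the HTTP requests have been exported as (method, destination IP, path) tuples. The `min_count` threshold and the `/example` path are hypothetical.

```python
from collections import Counter

def repeated_posts(requests, min_count=3):
    """Return (dst_ip, path) keys that received at least `min_count` POSTs.

    `requests` is an iterable of (method, dst_ip, path) tuples.
    """
    counts = Counter(
        (dst, path)
        for method, dst, path in requests
        if method.upper() == "POST"
    )
    return {key: n for key, n in counts.items() if n >= min_count}

# Illustrative requests; only the /ptmdP path and the IPs come from the findings
sample = [
    ("POST", "52.20.172.27", "/ptmdP"),
    ("POST", "52.20.172.27", "/ptmdP"),
    ("POST", "52.20.172.27", "/ptmdP"),
    ("POST", "104.26.3.17", "/example"),  # hypothetical path
    ("GET", "104.26.3.17", "/example"),
]
suspicious = repeated_posts(sample)
```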


18. Encrypted TLS Communication with External Servers

🔸 Why it matters
Malware often uses encrypted protocols such as TLS (HTTPS) to hide its communication from detection systems. Since the payload is encrypted, attackers can send commands or exfiltrate data without being easily inspected.

🔸 How it was analyzed in Wireshark


TLS traffic was filtered using:

tls && ip.src == 10.0.0.149

The packet list was examined to identify destination IPs, handshake details, and Server Name Indication (SNI) values.

🔸 Observation
From the filtered traffic, the internal host 10.0.0.149 was observed initiating multiple TLS sessions with external IPs such as 52.230.222.68 and 13.107.21.200.

Key findings:

  • Repeated TLS Client Hello messages were observed, indicating multiple connection initiations
  • SNI values include domains like client.wns.windows.com and www.bing.com
  • A continuous flow of Application Data packets follows the handshake, indicating encrypted data exchange
  • Sessions are frequently established in short intervals, suggesting automated communication behavior

Although some domains appear legitimate, the repetitive and structured nature of these encrypted sessions indicates non-human activity.

🔸 Evidence



🔸 Conclusion
The presence of repeated encrypted TLS sessions from the internal host to external servers suggests automated communication behavior. Even though encryption hides the content, the pattern of frequent connections and continuous data exchange is consistent with malware attempting to communicate with command-and-control infrastructure.


19. Suspicious Domain Query Patterns (TLD & Domain Structure Analysis)

🔸 Why it matters
Malware often communicates with attacker-controlled domains that may appear legitimate but follow automated or unusual naming patterns. Even when common TLDs like .com are used, the structure and frequency of queries can reveal suspicious behavior.

🔸 How it was analyzed in Wireshark
DNS queries were filtered using:

dns.flags.response == 0

Steps followed:
  • Apply filter dns.flags.response == 0
  • Inspect the Domain Name System → Queries → Name field
  • Observe domain patterns, frequency, and repetition

🔸 Observation
From the DNS query traffic, the internal host 10.0.0.202 was observed generating multiple DNS requests in rapid succession.

Key findings:

  • Queries include domains such as:
    • connectivity-check.ubuntu.com
    • daisy.ubuntu.com
    • api.snapcraft.io
    • wpad.steelcoffee.net
    • autoupdate.geo.opera.com
    • www.google.com, www.bing.com, www.amazon.com
  • A mix of legitimate and structured domains is observed
  • Repeated queries to wpad.steelcoffee.net stand out as unusual and potentially suspicious
  • High frequency of DNS requests within a short time interval
  • Presence of both normal browsing domains and less familiar domains in the same flow

This combination suggests automated domain resolution rather than natural human browsing behavior.

🔸 Evidence 



🔸 Conclusion
Although many domains use common TLDs like .com, the repeated querying pattern and presence of structured or uncommon domains (such as wpad.steelcoffee.net) indicate suspicious activity. This behavior is consistent with malware attempting to resolve multiple domains for command-and-control communication or fallback infrastructure.


20. Suspicious CDN / Cloud-Like Domain Usage

🔸 Why it matters
Attackers often use trusted infrastructure such as CDN services (Cloudflare, Akamai, etc.) to hide malicious communication. By blending with legitimate traffic, malware avoids detection and appears like normal web activity.

🔸 How it was analyzed in Wireshark
DNS traffic was analyzed using:

dns

The following fields were inspected:
• Domain Name System → Queries → Name
• Domain Name System → Answers (CNAME chains)

🔸 Observation
From the DNS traffic, the internal host 10.0.0.202 was observed querying multiple domains that resolve through CDN-like infrastructure.

Key findings:

  • Domains such as:
    • www.msftconnecttest.com → resolving through msedge.net (Microsoft CDN)
    • client.wns.windows.com → resolving through akadns.net (Akamai CDN)
    • autoupdate.geo.opera.com → resolving via multiple intermediate domains
  • These domains use CNAME redirection chains, which is typical of CDN-based routing
  • A mix of legitimate CDN-backed domains and other less familiar domains is observed
  • Repeated DNS lookups indicate automated behavior rather than casual browsing

Although these services are legitimate, malware often leverages similar infrastructure patterns to hide its communication.

🔸 Evidence 




🔸 Conclusion
The presence of CDN-backed domain resolution patterns, combined with repetitive DNS queries, suggests an attempt to blend malicious communication with legitimate traffic. This behavior is commonly associated with malware using trusted infrastructure to evade detection and maintain stealthy communication channels.
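
CNAME chains can be followed programmatically once the answer records are collected into a mapping. The sketch below is a minimal illustration. The record names are placeholders under example.com and do not reproduce the chains seen in the capture.

```python
def follow_cname(name, cname_records, max_hops=10):
    """Follow CNAME records from `name` and return the full chain of names.

    `cname_records` maps an alias to its canonical name, as read from the
    DNS answer sections. `max_hops` and the loop check guard against
    malformed or circular data.
    """
    chain = [name]
    while chain[-1] in cname_records and len(chain) <= max_hops:
        nxt = cname_records[chain[-1]]
        if nxt in chain:  # circular reference, stop here
            break
        chain.append(nxt)
    return chain

# Hypothetical records for illustration only
records = {
    "www.example.com": "edge.example-cdn.net",
    "edge.example-cdn.net": "node1.example-cdn.net",
}
chain = follow_cname("www.example.com", records)
```

The last name in the chain shows which provider ultimately serves the domain, which makes it easier to separate well-known CDN suffixes from unfamiliar ones.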


The 5 Effects of Malware

🔹 1. Unauthorized Data Exfiltration
The malware initiates outbound communication using HTTP POST requests, transmitting data to external servers. These structured and repeated POST requests indicate potential leakage of system or network information to attacker-controlled infrastructure.

🔹 2. Persistent Command-and-Control Communication
The infected host maintains continuous communication with external IP addresses through HTTP and encrypted TLS sessions. This behavior confirms the presence of command-and-control (C2) communication used to receive instructions and send status updates.

🔹 3. System Resource Misuse
Abnormal traffic patterns such as high packet rates, bursty throughput spikes, and repeated short-lived TCP sessions indicate excessive consumption of network resources. This leads to inefficient bandwidth usage and degraded system performance.

🔹 4. Evasion Using Legitimate Infrastructure
The malware leverages trusted domains and CDN-backed services (e.g., Microsoft, Akamai, Opera update servers) to disguise its communication. This blending with legitimate traffic helps evade traditional detection mechanisms.

🔹 5. Automated Network Interaction and Reconnaissance
The infected host generates repeated DNS queries and TCP connection attempts across multiple external IPs. This automated behavior suggests scanning, probing, or maintaining multiple communication channels for resilience.

New Findings from My Analysis

  • Identified infected internal hosts 10.0.0.149 and 10.0.0.202 based on abnormal outbound communication patterns
  • Detected repeated communication with external IPs such as 52.20.172.27, 104.26.3.17, and 52.230.222.68
  • Observed structured and repeated HTTP POST requests (/ptmdP) indicating automated data transmission
  • Identified continuous HTTP GET requests to WordPress resources showing scripted, non-human browsing behavior
  • Detected absence of normal user browsing patterns (no delays, repetitive request structure)
  • Observed high frequency of DNS queries from a single host, indicating automated domain resolution
  • Identified suspicious domain wpad.steelcoffee.net repeatedly queried, indicating potential malicious lookup behavior
  • Detected use of both legitimate and uncommon domains, suggesting fallback or multi-domain communication strategy
  • Observed repeated TLS Client Hello messages, indicating frequent encrypted session initiation
  • Detected continuous encrypted Application Data flow, hiding actual data exchange
  • Identified short-lived TCP sessions with low packet counts and minimal data transfer
  • Observed multiple TCP retransmissions, indicating unstable or repeated connection attempts
  • Detected asymmetric data flow, where external servers send significantly more data than the internal host
  • Observed traffic burstiness, with sudden spikes in throughput and packet rate
  • Identified combination of small packets (commands) and large packets (data transfer)
  • Detected repeated SYN packet activity, indicating continuous connection attempts
  • Observed usage of CDN-based infrastructure (msedge.net, akadns.net) to mask communication
  • Confirmed system identity using DHCP fingerprinting, linking activity to legitimate internal devices
  • Identified presence of LLMNR and mDNS traffic, indicating internal network interaction
  • Correlated multiple indicators confirming Command-and-Control (C2) communication behavior
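
Two of the findings above, periodic beaconing and asymmetric data flow, lend themselves to simple numeric checks. The sketch below is one possible way to quantify them, not the procedure followed in this analysis. The function names `beacon_score` and `asymmetry_ratio` are hypothetical, and the sample timestamps and byte counts are made up for illustration.

```python
from statistics import mean, pstdev


def beacon_score(timestamps):
    """Coefficient of variation of the gaps between connection start times.

    Values near 0 mean the gaps are almost identical, which is consistent
    with timer-driven beaconing. Returns None when there are fewer than
    two gaps or the mean gap is zero.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    average_gap = mean(gaps)
    if average_gap == 0:
        return None
    return pstdev(gaps) / average_gap


def asymmetry_ratio(bytes_out, bytes_in):
    """Inbound bytes divided by outbound bytes for one conversation."""
    if bytes_out == 0:
        return float("inf")
    return bytes_in / bytes_out


# Made-up connection start times in seconds (hypothetical values).
regular = [0, 60, 120, 180, 240]   # fixed 60-second interval
irregular = [0, 5, 70, 72, 300]    # human-like, uneven gaps

print(beacon_score(regular))
print(beacon_score(irregular))
print(asymmetry_ratio(bytes_out=1000, bytes_in=50000))
```

Real beacons often add jitter, so a low but non-zero score can still be suspicious. Any cut-off value is an assumption that should be checked against benign periodic traffic such as update checks.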

The Use of AI

Artificial Intelligence tools were used to enhance the efficiency and depth of this malware traffic analysis. AI-assisted guidance helped in identifying critical behavioral indicators such as beaconing patterns, repeated HTTP POST requests, TLS communication anomalies, and DNS query irregularities.

AI also supported the structuring of the analysis by organizing findings into clear inferences, improving readability and logical flow. It assisted in correlating multiple independent indicators, enabling a more comprehensive understanding of the malware’s behavior without executing the malicious payload.

Additionally, AI helped in generating meaningful interpretations of network traffic patterns, such as burstiness, asymmetric communication, and automated request sequences. This significantly improved both the speed and accuracy of the investigation.

Overall, AI played a crucial role in transforming raw packet-level data into actionable insights, making the analysis more structured, detailed, and research-oriented.

Conclusion

The network traffic analysis of the provided PCAP file clearly demonstrates the presence of malware activity within the system. Through detailed inspection using Wireshark, multiple strong indicators were identified, including abnormal outbound communication, repeated connections to external IP addresses, structured HTTP POST requests, and automated HTTP GET patterns.

The analysis further revealed encrypted TLS communication, continuous DNS query activity, and the use of legitimate-looking domains to disguise malicious behavior. These patterns indicate that the malware is actively communicating with external infrastructure while attempting to evade detection mechanisms.

Behavioral indicators such as short-lived TCP sessions, retransmissions, traffic burstiness, and asymmetric data flow indicate that the observed activity is automated rather than human-driven. The presence of both command-like small packets and larger data transfers suggests staged communication and potential data exfiltration.

By correlating multiple independent indicators, including beaconing behavior, protocol anomalies, and encrypted communication patterns, this analysis provides strong evidence of command-and-control activity associated with malware infection.

This study highlights the effectiveness of passive network traffic analysis in detecting and understanding malware behavior without executing the malicious code, reinforcing the importance of behavioral analysis in modern cybersecurity investigations.


YouTube Video Link

https://youtu.be/ub9Q6ICB63I

GitHub Repository 

https://github.com/ramankumar2024-glitch/Malware-Traffic-Analysis

References

  1. Malware PCAP Dataset and Original Malware Analysis Blog:
    https://www.malware-traffic-analysis.net/2020/04/24/index.html

Acknowledgement

I would like to express my sincere gratitude to my faculty member, Dr. Subbulakshmi T, for providing the opportunity to work on this data analysis assignment. This project significantly enhanced my practical understanding of malware traffic analysis using Wireshark and deepened my knowledge of network protocols and real-world attack patterns.

I also extend my appreciation to the creators of the Malware Traffic Analysis platform for providing high-quality datasets that enabled this investigation. Their resources played a crucial role in making this analysis realistic and insightful.

Additionally, I acknowledge the contribution of AI tools, which assisted in structuring the analysis, identifying key behavioral patterns, and improving clarity in interpreting complex network data.

Finally, I thank VIT Chennai for providing the necessary academic environment and resources to successfully complete this work.
