Restoring a protocol model / Reversing a communication protocol
What is this?
This is the process of recovering the internals of a communication protocol: the communication method, encryption details, data structures, and the logic of the exchange.
Why do we need this?
Recovered protocol information can be of interest to many customers. Here are some reasons why it can be useful to know how a particular application communicates:
– Security (what data is transferred and where, when and how often, and how safely);
– Source code or documentation has been lost and further development is required;
– Knowledge of the protocol specification can reveal information about application and server internals;
– A protocol model can be used for security testing to prevent future attacks;
– Protocol analysis can also be used to compile recommendations for improving the protocol under investigation (how to speed it up, make it more secure, or save traffic).
Another area where protocol reversing is applied is malware analysis laboratories, where it is used to reconstruct botnet infrastructure and find a way to shut it down or disarm it. Timely knowledge of a malware protocol can stop its spread before it becomes an epidemic or harms the community.
How hard is this?
Protocol analysis should be performed by qualified professionals, because it requires in-depth knowledge of networking and operating systems and a thorough knowledge of security systems and cryptography. The analyst must also have extensive experience in reversing applications, strong concentration, and an analytical mind. Our whole team is composed of such individuals, each of whom is highly responsible and knows the job thoroughly.
Protocol restoration algorithm
Here is a step-by-step walkthrough of how we do this and what problems we solve along the way. Note that this algorithm isn’t a cure-all: each product has its own specifics, and we carry out each protocol analysis with all of its features in mind.
Set up environment
The first thing we need is to set up a proper environment for the analysis. This environment should take into account the specific hardware configuration, operating system and software configuration, geographical location, and isolation. A virtual environment is enough in most cases, but if required we provide a proper standalone dedicated network. This lets us reproduce the original working environment.
Determine the protocol used
Determining which protocol is used is a fundamental step before any analysis can be made; it could be TCP/IP, NetBIOS, IPX, or something more specific. In most cases it is TCP/IP, since it is what the Internet runs on.
Determine data transfer method
The data transfer method determines how we analyze the protocol further. It can be a normal TCP/IP connection, or a series of unrelated DNS queries, each carrying a small piece of data.
Get proper connection
If the host changes with each connection, we should determine what rules govern the change. We need to gather comprehensive information on how the connection is made, when, what it depends on, and what can trigger it.
First data
The next step is to capture the first protocol data. We have a large set of tools for doing this quickly, even for highly secured connections. The first data reveals more about the protocol internals. Signature scanning can often give basic information on how the data is transferred: whether it is plain, SSL-encrypted, or something else. After this we should have enough information to start analyzing packing. Signature scanning can also help at this stage; for example, zip, zlib, or any other standard packing method can be identified by its signature. If no standard method fits but the data still appears to be packed, it is time for disassembly and debugging tools that can shed light on how the data is unpacked or decrypted. Software developers often build custom packing solutions to hinder protocol analysis and increase the time it takes.
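As an illustration of signature scanning, here is a minimal Python sketch that checks the first bytes of a captured payload against a few well-known signatures. The function names and the short signature table are assumptions made for this example, not part of any particular tool, and a match is only a hint: short headers such as zlib’s can match by coincidence.

```python
import zlib

# Well-known magic bytes; this table is illustrative, not exhaustive.
SIGNATURES = [
    (b"PK\x03\x04", "zip archive"),
    (b"\x1f\x8b", "gzip stream"),
    (b"BZh", "bzip2 stream"),
    (b"\xfd7zXZ\x00", "xz stream"),
]

def looks_like_tls(data: bytes) -> bool:
    """SSL/TLS record header: content type 20-23, major version 3, minor 0-4."""
    return len(data) >= 3 and 20 <= data[0] <= 23 and data[1] == 3 and data[2] <= 4

def looks_like_zlib(data: bytes) -> bool:
    """zlib header: method 8 (deflate), window bits <= 7, 16-bit header divisible by 31."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    return (cmf & 0x0F) == 8 and (cmf >> 4) <= 7 and ((cmf << 8) | flg) % 31 == 0

def guess_encoding(data: bytes) -> str:
    """Return a best-effort label for how the payload appears to be packed."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    if looks_like_tls(data):
        return "TLS record"
    if looks_like_zlib(data):
        return "zlib stream"
    return "unknown"

if __name__ == "__main__":
    print(guess_encoding(zlib.compress(b"hello")))
    print(guess_encoding(b"\x16\x03\x01\x00\x05hello"))
    print(guess_encoding(b"GET / HTTP/1.1"))
```

When every check returns "unknown" but the data still looks packed, that is the point at which disassembly and debugging take over.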
Second data
Continuing the analysis, we answer the question: is the protocol stream-based? If it is, further analysis is needed to understand the data transfer mechanism. An application that ignores the stream nature of the protocol is doomed to lose data and connections. At this stage we need to determine package boundaries and find a way to read from and write to the data channel so that the other side can properly process an answer. If package sizes vary, completing this stage requires determining the size of each package. It is often transmitted in the package header, but it could be anywhere, and it may be protected by encryption.
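To show why package boundaries matter on a stream, here is a minimal Python sketch of a reader for one common layout: a length prefix in the package header. The 4-byte big-endian length field and the names `FrameReader` and `encode_frame` are assumptions chosen for the example; a real protocol may place or protect the size differently.

```python
import struct
from typing import List

# Assumed header: 4-byte big-endian payload length (hypothetical layout).
HEADER = struct.Struct(">I")

def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the peer can find the boundary."""
    return HEADER.pack(len(payload)) + payload

class FrameReader:
    """Accumulates stream chunks and yields only complete packages."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        self._buf += chunk
        frames = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # package is incomplete; wait for more data
            frames.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return frames

if __name__ == "__main__":
    stream = encode_frame(b"hello") + encode_frame(b"world!")
    reader = FrameReader()
    # The stream may be cut anywhere, here in the middle of the first package.
    print(reader.feed(stream[:6]))
    print(reader.feed(stream[6:]))
```

A reader that assumes one network read equals one package would fail on the first call here, which is the kind of data loss described above.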
More data
The last question to complete the low-level understanding of the protocol: is it synchronous or asynchronous? At this stage we determine what an application must do to work with the protocol properly and answer in time. Synchronous protocols are easier to understand. Both approaches are popular, and each has its own area of application. Programmers often add their own package counters and protect them with keys generated by some elaborate algorithm. Only detailed analysis of the executed code can show fully how this is done.
Data store methods
At this stage we determine how data is stored inside the packages:
– 7-bit / 8-bit encoding
– normal byte order or reverse byte order
– how strings are stored and terminated
– representation of negative numbers
– representation of floating-point numbers
– representation of long (64-bit) numbers
– boolean types
– field separators / terminators
– information scheme (HTML, XML, …)
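Several of the questions in this list can be explored by decoding the same bytes under competing assumptions and seeing which reading is plausible. The Python sketch below does this for a 4-byte field and for two common string layouts; the helper names and the specific layouts (NUL-terminated ASCII, 2-byte big-endian length prefix) are assumptions for illustration only.

```python
import struct

def candidates(field: bytes) -> dict:
    """Interpret one 4-byte field under several byte-order and type assumptions."""
    return {
        "u32 little-endian": struct.unpack("<I", field)[0],
        "u32 big-endian": struct.unpack(">I", field)[0],
        "i32 little-endian": struct.unpack("<i", field)[0],
        "f32 little-endian": struct.unpack("<f", field)[0],
    }

def read_cstring(data: bytes, offset: int = 0):
    """Read a NUL-terminated ASCII string; return (text, offset after terminator)."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("ascii"), end + 1

def read_pstring(data: bytes, offset: int = 0):
    """Read a string with an assumed 2-byte big-endian length prefix."""
    (length,) = struct.unpack_from(">H", data, offset)
    start = offset + 2
    return data[start:start + length].decode("utf-8"), start + length

if __name__ == "__main__":
    for name, value in candidates(b"\x01\x00\x00\x00").items():
        print(name, value)
    print(read_cstring(b"abc\x00def\x00"))
    print(read_pstring(b"\x00\x03abcXYZ"))
```

A field that decodes to 1 in one byte order and 16777216 in the other is usually easy to classify once its meaning in the conversation is known.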
Package structure
Once the storage methods have been investigated, the next stage is to learn the internal structure of each package used. Often the host can answer in different ways, so the analyst should identify and describe every possible request and answer, along with the fields of each. An understanding of the product’s logic emerges at this stage, so the analyst will be ready to restore the protocol model.
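One way to record what is learned at this stage is a table that maps each package type to its field layout. The Python sketch below assumes a hypothetical protocol in which the first byte selects the package type; the type codes, field names, and layouts are invented for the example and do not describe any real product.

```python
import struct
from typing import NamedTuple

class Login(NamedTuple):
    user_id: int
    flags: int

class Ping(NamedTuple):
    timestamp: int

# Hypothetical mapping: type byte -> (body layout, record class).
LAYOUTS = {
    0x01: (struct.Struct(">IH"), Login),  # u32 user id, u16 flags
    0x02: (struct.Struct(">Q"), Ping),    # u64 timestamp
}

def parse_package(package: bytes):
    """Decode one package according to the layout table above."""
    if not package:
        raise ValueError("empty package")
    kind, body = package[0], package[1:]
    if kind not in LAYOUTS:
        raise ValueError(f"unknown package type 0x{kind:02x}")
    layout, record = LAYOUTS[kind]
    if len(body) != layout.size:
        raise ValueError(f"type 0x{kind:02x}: expected {layout.size} body bytes, got {len(body)}")
    return record(*layout.unpack(body))

if __name__ == "__main__":
    print(parse_package(b"\x01" + struct.pack(">IH", 42, 3)))
    print(parse_package(b"\x02" + struct.pack(">Q", 1000)))
```

Keeping the layouts in data rather than in code makes it easy to add a newly observed request or answer, and unknown type codes surface immediately as errors.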
Restoring complete model
The last stage of the analysis is to produce all the documentation needed to describe the communication protocol in detail, so that a programmer can reimplement it from this model and the result will behave identically to the original application.
