Lionic Regular Expression Solutions

1. Overview

Lionic is one of the few regular expression vendors that also ship market-proven Deep Packet Inspection (DPI) applications such as anti-virus and anti-intrusion. Most competitors offer only a hardware or software regular expression product. Because of our experience developing DPI applications, we have a clear understanding of how to design the underlying search mechanism so that it works well with the applications above it, at least for all the Lionic DPI applications. Lionic DPI solutions are built on the Lionic regular expression solutions.

For this reason, every Lionic RE solution includes, in addition to the regular expression part, a lightweight and efficient search mechanism for plain strings. This is convenient for customers. In the real world, a lightweight search mechanism is usually run first as a pre-filter. If the pre-filter finds some clues, the regular expression rules are then executed for precise checking. Using the regular expression mechanism to search for plain strings alone wastes both time and space.
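The two-stage flow described above can be sketched in a few lines of Python. This is only an illustration of the general pre-filter idea, not Lionic's implementation: the rule names, literals and regular expressions are hypothetical, and the standard `re` module stands in for the regular expression engine.

```python
import re

# Hypothetical rules: each pairs a cheap lowercase literal "clue" with a precise regex.
RULES = [
    {
        "name": "exe-download",
        "literal": b".exe",
        "regex": re.compile(rb"GET /[\w/.-]*\.exe HTTP/1\.[01]"),
    },
    {
        "name": "sql-union-select",
        "literal": b"union",
        "regex": re.compile(rb"union\s+select", re.IGNORECASE),
    },
]


def scan(data: bytes) -> list:
    """Return the names of rules whose literal AND regex both match."""
    lowered = data.lower()  # case-folded copy used only by the pre-filter
    hits = []
    for rule in RULES:
        # Stage 1: cheap plain-string pre-filter. It may pass harmless data
        # (false positives) but must never reject data the regex would match.
        if rule["literal"] not in lowered:
            continue
        # Stage 2: precise regular expression check, run only on candidates.
        if rule["regex"].search(data):
            hits.append(rule["name"])
    return hits
```

For example, `scan(b"GET /files/setup.exe HTTP/1.1\r\n")` reaches the second stage and reports `exe-download`, while data without any of the literals is rejected by the pre-filter and never touches the regular expression engine.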

2. Both FA and PFAC are Multi-Pattern Search Engines

The Lionic regular expression solutions contain two search engines, “PFAC” and “FA”. “PFAC” is the code name for Lionic’s internal research project on pre-filter Aho-Corasick, and “FA” is the code name for Lionic’s internal research project on finite automata. Both PFAC and FA are multi-pattern search engines (MPSE); that is, both engines can search for many patterns in the data simultaneously.
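To make "searching many patterns simultaneously" concrete, here is a minimal sketch of the textbook Aho-Corasick algorithm. It is not PFAC or FA and contains none of Lionic's enhancements; the function names `build_automaton` and `search` are chosen for illustration only.

```python
from collections import deque


def build_automaton(patterns):
    """Build a textbook Aho-Corasick automaton for a list of byte patterns."""
    goto = [{}]   # goto[state][byte] -> next state (the pattern trie)
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> patterns that end at this state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: compute failure links and inherit their outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(data, automaton):
    """Scan data once and return (start_offset, pattern) for every match."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for pos, ch in enumerate(data):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((pos - len(pat) + 1, pat))
    return matches
```

Each input byte is examined once regardless of how many patterns are loaded. For instance, searching `b"ushers"` with the patterns `he`, `she`, `his` and `hers` reports `she` at offset 1, and `he` and `hers` at offset 2, in a single pass.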

Although both search engines are derived from the Aho-Corasick algorithm and automata theory, Lionic has made its own modifications and enhancements to meet real-world needs. Several important techniques in both projects are patented in the USA and some other countries. Users may choose either of the two engines depending on the situation.

3. The PFAC inside the Lionic RE Solutions

The Aho-Corasick algorithm is a kind of simplified automaton. It is commonly used as the pre-filter in intrusion prevention systems such as Snort and in some other applications. In the Lionic version of the Aho-Corasick method, however, we added an internal pre-filter, which is entirely different from the application-level pre-filter. That is why the code name is “PFAC” (Pre-Filter AC). This improves performance greatly in many DPI cases.

Besides the above speed enhancement, Lionic has made many other modifications to this version of the Aho-Corasick method. Customers may easily find other existing Aho-Corasick source code, but the “PFAC” inside the Lionic regular expression solutions is specially modified for DPI applications. Using it still saves a lot of time compared with searching for an existing implementation that works.

4. The FA inside the Lionic RE Solutions

Now for the regular expression part. A good and practical RE (regular expression) implementation should have the following characteristics -

  1. A good RE implementation should be able to execute many thousands of regular expressions simultaneously and efficiently. If the regular expressions are converted into a single automaton, performance is deterministic but the search graph is larger. If they are converted into multiple automata, the total size of all the automata is smaller but performance is slower.
  2. A good RE implementation should be able to recognize patterns across pieces of data. Sometimes the data to be searched is divided into many pieces that are fed sequentially into the RE mechanism. For example, a network stream may contain many thousands of network packets. It is dangerous if a computer virus spans several packets and the DPI engine fails to find it.
  3. A good RE implementation should be able to support many concurrent sessions. There are many TCP connections in the real world. For example, you may watch YouTube videos and run BitTorrent at the same time. Meanwhile, some system daemons may connect to some servers.
  4. A good RE implementation should use techniques that make the search graph as compact as possible, to ease the state explosion problem of regular expressions. In regular expressions, wildcards and similar symbols can easily cause the total number of search graph states to grow exponentially.
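The state explosion mentioned in characteristic 4 can be demonstrated with a classic textbook example. The sketch below assumes the regular expression (a|b)*a(a|b){n}, which is not taken from any Lionic rule set, and counts the states that the standard subset construction produces when turning its small NFA into a DFA. It illustrates the problem only; it says nothing about how Lionic's compression works.

```python
def dfa_state_count(n: int) -> int:
    """Count reachable DFA states for the regex (a|b)*a(a|b){n}.

    The NFA has n + 2 states: state 0 loops on 'a' and 'b' and also moves
    to state 1 on 'a'; states 1..n advance on either symbol; state n + 1
    is the accepting state and has no outgoing transitions.
    """
    def step(nfa_states, ch):
        nxt = set()
        for s in nfa_states:
            if s == 0:
                nxt.add(0)
                if ch == "a":
                    nxt.add(1)
            elif s <= n:
                nxt.add(s + 1)
        return frozenset(nxt)

    # Subset construction: each DFA state is a set of NFA states.
    start = frozenset({0})
    seen = {start}
    todo = [start]
    while todo:
        cur = todo.pop()
        for ch in "ab":
            nxt = step(cur, ch)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)
```

The NFA grows linearly with n, but the DFA has to remember which of the last n + 1 symbols were 'a', so `dfa_state_count(n)` returns 2 ** (n + 1): 2 states for n = 0, 16 for n = 3, 1024 for n = 9. Counted wildcards in real signatures behave in a similar way, which is why compact graph representations matter.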

The “FA” inside the Lionic regular expression solutions is such an implementation and has obtained several patents in several countries. By contrast, some companies still use “memcmp()” as their search mechanism for network packets. The popular PCRE (Perl Compatible Regular Expressions) is an implementation that supports the complete regular expression syntax but is very slow.
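Characteristics 2 and 3 above, cross-packet matching and concurrent sessions, both come down to keeping a small amount of automaton state per session between packets. The sketch below illustrates this for a single literal pattern using the well-known Knuth-Morris-Pratt automaton. The class name `StreamMatcher`, its `feed` method and the session identifiers are hypothetical and are not the Lionic API.

```python
class StreamMatcher:
    """Match one non-empty byte pattern across packet boundaries, per session."""

    def __init__(self, pattern: bytes):
        self.pattern = pattern
        self.fail = self._failure_table(pattern)
        # session id -> number of pattern bytes matched so far
        self.sessions = {}

    @staticmethod
    def _failure_table(p):
        # fail[i] = length of the longest proper prefix of p[:i + 1]
        # that is also a suffix of it (standard KMP table).
        fail = [0] * len(p)
        k = 0
        for i in range(1, len(p)):
            while k and p[i] != p[k]:
                k = fail[k - 1]
            if p[i] == p[k]:
                k += 1
            fail[i] = k
        return fail

    def feed(self, session_id, packet: bytes) -> bool:
        """Scan one packet; return True if the pattern completed in it."""
        k = self.sessions.get(session_id, 0)  # resume where this session stopped
        found = False
        for b in packet:
            while k and b != self.pattern[k]:
                k = self.fail[k - 1]
            if b == self.pattern[k]:
                k += 1
            if k == len(self.pattern):
                found = True
                k = self.fail[k - 1]
        self.sessions[session_id] = k  # save state for the next packet
        return found
```

With the pattern `b"EVILCODE"`, feeding `b"..EVIL"` and then `b"CODE.."` on the same session reports a match on the second packet, even when packets from another session are interleaved between them. A per-packet comparison finds nothing in either packet, because neither one contains the whole pattern.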

5. The Possible Applications of the Lionic RE Solutions

The Lionic regular expression solutions are developed not only for network packets but also for other kinds of multi-stream data, so they can be applied to a wide range of applications. These include, but are not limited to, the following areas -

  • Next Generation Firewalls (NGFW)
  • Anti-Virus (AV)
  • Intrusion Prevention Systems (IPS)
  • Application Identification
  • Device Identification
  • Application-based QoS in SD-WAN
  • Smart NICs
  • DDoS Mitigation
  • Network Monitoring
  • Data Loss Prevention (DLP)
  • Grammar based Content Processing
  • URL, Spam and Adware filtering
  • Advanced auditing and policing of user/application security policies
  • Financial data mining - parsing of streamed financial feeds
  • Memory Introspection
  • Natural Language Processing (NLP)
  • Sentiment Analysis
  • Big Data database acceleration (Spark, Hadoop, etc.)
  • Computational Storage

However, a good search mechanism alone accomplishes nothing. Developers still need to combine the Lionic regular expression solutions with application-specific domain knowledge. For example, to develop an anti-virus network gateway feature, you need to develop a packet interception mechanism and have the Lionic regular expression solutions scan all the intercepted packets. In addition, anti-virus experts need to extract suitable patterns for the anti-virus signatures after studying the virus. The same holds for the other applications: each needs both the Lionic regular expression solutions and application-specific domain knowledge.

6. The Lionic RE and DPI Solutions

Lionic already has several market-proven DPI applications, including anti-virus, anti-intrusion, anti-webthreat, application identification and device identification. Some other solutions and extensions are not related to DPI, but Lionic still develops them in order to offer complete solutions. They can be categorized as security solutions and content management solutions. We recommend that customers use our existing DPI solutions instead of developing applications on top of regular expression products.

If customers need only the regular expression mechanism and will develop their own applications, Lionic offers both a hardware-accelerated version and a pure software version. Both have been market proven for many years and are ready to ship: Argus LA3000 and RE-Soft.


7. Hardware Accelerated FA and PFAC -
Lionic Argus LA3000 Silicon IP

The Lionic Argus LA3000 is a regular expression silicon IP designed for DPI (deep packet inspection). Supporting layer-7 applications has become a demanding requirement for current security and content-aware network equipment. Many such applications, such as anti-virus, IPS and traffic classification, rely on DPI to achieve their functionality. Lionic Argus LA3000 supports pattern matching for inbound and outbound data and uses patented technologies to provide excellent performance with minimal memory usage and signature maintenance requirements. The patterns to be matched are represented as a list of regular expressions. To perform pattern matching, the regular expressions are first compiled (or translated), and the inspection code produced by the compiler is then downloaded into the core.


Features -

  • Up to 2.5 Gbps throughput while executing any number of regular expression rules (limited only by memory)
  • Supports more complex regular expressions as well as plain string patterns
  • Supports up to 12 parallel content inspection engines
  • Supports up to 256 independent pattern sets for various applications
  • Supports up to 4 million patterns per pattern set
  • Smaller memory footprint
    • Compressed finite automata consume less memory (patented in the USA and several other countries).
    • Special regular expression syntax is supported by hardware circuits (patented in the USA and several other countries).
  • Cross-packet searching: content inspection across packet boundaries with minimal overhead
  • Compatible with the AXI v3 and v4 specifications

8. Pure Software FA and PFAC -
Lionic RE-Soft Engine

Besides the hardware-accelerated regular expression silicon IP, Lionic also provides a software regular expression and PFAC search engine, RE-Soft. The functionality of RE-Soft is the same as that of the hardware-accelerated Argus LA3000, and it provides the same APIs as the Argus Universal Driver Interface.

For platforms without the hardware-accelerated Argus LA3000, Lionic can still provide all the same functions through RE-Soft. The Lionic DPI (Deep Packet Inspection) solutions such as Anti-Virus and Intrusion Prevention can use either the hardware or the software version of the Lionic regular expression solutions.