NSF IRNC AMIS project repository
### Visualizations

Our traffic matrix visualizations are built for the web using the D3 and Leaflet JavaScript libraries. Their design applies lessons from E. Tufte’s “The Visual Display of Quantitative Information”. Some examples are:
The image below shows our traffic matrix visualization at the Autonomous System (AS) level.
Traffic matrix visualization at the AS level
### Software defined network measurement and analytics

We have worked to define the functionality of the layers of abstraction for a software defined network measurement and analytics stack. We are adopting the 3-layer software defined networking (SDN) model. The figure below illustrates these layers of abstraction. The Infrastructure layer consists of the software-controlled network measurement devices (e.g., network taps, switches, perfSONAR). The Control layer consists of the network data analytics implemented using data science software ecosystems such as Python/PANDAS, Apache Spark, and the Linux Foundation’s Platform for Network Data Analytics (PNDA). We will use the Python/PANDAS data science software ecosystem to implement the SDNM control layer. Finally, the Application layer consists of applications that automate the tasks of network operators (e.g., detect and react to intrusions, detect and correct faults) and leverage the Control and Infrastructure layers.
Our primary focus is on the SDNM control layer. This layer will contain the following categories of functionality available to the application layer:
Software defined network measurement and analytics (SDNM). We adopt the 3-layer SDN model. The analytics are contained in the SDNM Control layer.
Our Python SDNM API has been implemented as a Python module. Here are the available functions:
netflow(startTime, stopTime, version=False, flow_count=False, SysUptime=True, unix_secs=True, unix_nsecs=True, engine_type=False, engine_id=False, samp_rate=False, flowOVS=False, flowAPI=False, srcaddr=True, dstaddr=True, nexthop=False, inputif=False, outputif=False, dPkts=True, dOctets=True, first=True, last=True, srcport=True, dstport=True, tcp_flags=False, prot=True, tos=False, src_as=False, dst_as=False, src_mask=False, dst_mask=False, location, port)
Create a measurement task that generates Netflow records
Input: Measurement start and stop times as time.struct_time values, optional flags specifying which Netflow fields should be kept, and optionally the IP address (location) and port number (port) of the REST API
Output: Measurement task ID
netflowGetData(taskID, location, port)
Collect Netflow records from a Netflow measurement task
Input: Measurement task ID and optionally the IP address and port number of the REST API
Output: Netflow records as a PANDAS data frame
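
The start and stop times for netflow() are time.struct_time values, which the Python standard library can produce directly. The following is a minimal sketch of preparing those arguments for a five-minute measurement window. The module name `sdnm`, the REST address `127.0.0.1`, and port `8080` are assumptions made for illustration, so the API calls themselves are left as comments.

```python
import calendar
import time

# Measurement window: start now (UTC) and stop five minutes later.
start_time = time.gmtime()
stop_time = time.gmtime(calendar.timegm(start_time) + 300)

# Field flags to override; the names come from the netflow() signature.
field_flags = {"src_as": True, "dst_as": True, "tcp_flags": True}

# Hypothetical calls (module name and REST endpoint are assumptions):
# task_id = sdnm.netflow(start_time, stop_time,
#                        location="127.0.0.1", port=8080, **field_flags)
# flows = sdnm.netflowGetData(task_id, "127.0.0.1", 8080)
```
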
netflowLoad(filename)
Load Netflow records from a CSV file
Input: CSV filename
Output: Netflow records as a PANDAS data frame
netflowFlowtoolsLoad(filename)
Convert Netflow records from a flow-tools format file to our Netflow v5 CSV format
Input: Filename containing NetFlow records in flow-tools format
Output: PANDAS data frame containing NetFlow records in our format
nfdumpToNetflow(nfdumpData)
Convert Netflow records in NFDUMP CSV file format to our Netflow v5 format
Input: PANDAS data frame containing NetFlow records in NFDUMP format
Output: PANDAS data frame containing NetFlow records in our format
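
To illustrate the kind of conversion nfdumpToNetflow performs, here is a small sketch that renames the fields of a single record held in a plain dictionary. The column mapping is an assumption based on the short column names commonly seen in nfdump CSV exports (`sa`, `da`, `sp`, `dp`, `pr`, `ipkt`, `ibyt`); the actual function works on PANDAS data frames and may convert more fields or values than shown. The helper name `rename_nfdump_record` is hypothetical.

```python
# Assumed mapping from nfdump CSV column names to the Netflow v5 field
# names used in this document; check it against your own nfdump header.
NFDUMP_TO_V5 = {
    "sa": "srcaddr",
    "da": "dstaddr",
    "sp": "srcport",
    "dp": "dstport",
    "pr": "prot",
    "ipkt": "dPkts",
    "ibyt": "dOctets",
}


def rename_nfdump_record(record):
    """Return a new dict with known nfdump keys renamed; other keys are dropped."""
    return {v5: record[nf] for nf, v5 in NFDUMP_TO_V5.items() if nf in record}


# Example record using documentation-reserved IP addresses.
example = {"sa": "192.0.2.1", "da": "198.51.100.7", "sp": "51514",
           "dp": "22", "pr": "TCP", "ipkt": "12", "ibyt": "2048", "td": "0.5"}
converted = rename_nfdump_record(example)
```
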
netflowStore(filename, netflowData)
Store Netflow records to a CSV file
Input: CSV filename and Netflow records as a PANDAS data frame
Output: Return value of the PANDAS to_csv() method (None when writing to a file)
netflowDetectSSHIntrusion(netflowData)
Detect system intrusions via SSH in NetFlow data (uses SSHCure rules [University of Twente])
Input: Netflow records as a PANDAS data frame
Output: A dictionary containing data regarding the SSH system intrusions detected

#### Join functions:
annotateNetflow(netflowData)
Annotate Netflow records with AS and geographic information
Input: Netflow records as a PANDAS data frame
Output: Annotated Netflow records as a PANDAS data frame
netflowAddApplication(netflowData)
Adds application data to Netflow records
Input: Netflow records as a PANDAS data frame
Output: Netflow records (with application data) as a PANDAS data frame

#### Aggregation functions:
trafficMatrices(netflowData)
Create a dictionary of traffic matrices (continent, country, AS) from Netflow records containing AS and geographic information
Input: Netflow records as a PANDAS data frame
Output: Dictionary with traffic matrices and label indices
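
The idea behind an AS-level traffic matrix with label indices can be sketched in plain Python. This is not the module's implementation: it assumes each flow is a dictionary carrying the `src_as`, `dst_as`, and `dOctets` fields, and the function name `as_traffic_matrix` and the result keys are hypothetical. The AS numbers in the example come from the range reserved for documentation.

```python
def as_traffic_matrix(flows):
    """Sum bytes per (source AS, destination AS) pair into a square matrix."""
    # Every AS seen as a source or destination gets one row and one column.
    labels = sorted({f["src_as"] for f in flows} | {f["dst_as"] for f in flows})
    index = {asn: i for i, asn in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for f in flows:
        matrix[index[f["src_as"]]][index[f["dst_as"]]] += f["dOctets"]
    return {"matrix": matrix, "labels": labels, "index": index}


flows = [
    {"src_as": 64500, "dst_as": 64501, "dOctets": 1500},
    {"src_as": 64500, "dst_as": 64501, "dOctets": 500},
    {"src_as": 64501, "dst_as": 64502, "dOctets": 4000},
]
result = as_traffic_matrix(flows)
```

Rows are source ASes and columns are destination ASes, so `result["matrix"][i][j]` holds the bytes sent from `result["labels"][i]` to `result["labels"][j]`. The same grouping applies at the country or continent level by swapping the key fields.
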
netflowSummary(netflowData)
Summarize NetFlow data (byte distribution over applications/institutions)
Input: Netflow records as a PANDAS data frame
Output: Dictionary with byte distributions

#### Utility functions:
ipLookup(ip)
Look up the AS number and geographic data for an IP address using a local TCP service
Input: IP address in dotted decimal (string)
Output: A dictionary with IP address data: 'orgname', 'asnum', 'latitude', 'longitude', 'city', 'region', 'country', 'continent'
appLookup(port, prot)
Look up the application name using the port number and IP protocol number
Input: Port number and IP protocol number
Output: Application name (string)
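
A lookup keyed on port and IP protocol number could look like the sketch below. The table is a small hand-written sample of well-known ports, not the data source used by appLookup, and the function name `app_lookup` and its `default` parameter are hypothetical.

```python
# Illustrative table keyed by (port, IP protocol number).
# IP protocol numbers: 6 = TCP, 17 = UDP.
APPLICATIONS = {
    (22, 6): "ssh",
    (53, 6): "dns",
    (53, 17): "dns",
    (80, 6): "http",
    (123, 17): "ntp",
    (443, 6): "https",
}


def app_lookup(port, prot, default="unknown"):
    """Return the application name for a port/protocol pair, or `default`."""
    return APPLICATIONS.get((port, prot), default)
```

For example, `app_lookup(22, 6)` returns `"ssh"`, while a pair that is not in the table falls back to `"unknown"`.
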