Internet Inconsistencies with R-Programming

March 26, 2021

An Internet Protocol (IP) address is one of several Domain Name System (DNS) components. IP addresses are written in either IPv4 or IPv6 format. Internet directories contain further information about IP addresses. The approximate geographical location, Internet Service Provider (ISP), Virtual Private Network (VPN) usage, and Autonomous System Number (ASN) are a few examples of the data that can be found.

If not redacted, these pieces of information can be combined into one collective research resource. This tutorial can help individuals and groups who are interested in detecting internet inconsistencies.


As a prerequisite, the reader should have the following:

  • A computer capable of running Linux and R.
  • A working Linux environment or emulator (Kali Linux is used in this tutorial).
  • R programming software.
  • Internet access.
  • Basic knowledge of DNS mechanics.
  • Familiarity with installing R libraries and reading their documentation.
  • Some arithmetic experience.


One goal of this tutorial is to acknowledge internet gaps that may affect unaware individuals and groups. An additional goal is to provide insights into internet complexities.

It is also important for readers to understand terms and content within scope.


In this tutorial, R is used to statistically analyze data about an IPv4 address. The purpose is to better understand the accuracies and inaccuracies in internet data.

As a starting point, will be the defined IP address throughout this tutorial.

Let’s get started.

Linux fundamentals

Open any Linux shell.

If you prefer to work without root, update the package lists with sudo:

sudo apt update

As a reminder, users with permission can switch to root mode by entering the following line:

sudo -i

If you prefer to work as root, run the update directly:

apt update

Open a new Kali Linux window and enter the following:


A window similar to the one below should pop up:


Screen capture


Enter the following line to install the Linux version of R:

sudo apt-get install r-base r-base-dev

The following screens may appear:


Screen capture


Screen capture

Alternatively, an R development environment such as RStudio can be equally effective.


Screen capture of RStudio

Install the libraries used in this tutorial if they are not already present:

# dependencies = TRUE is an argument of install.packages(), not an element of c()
install.packages("Rwhois", dependencies = TRUE)
install.packages("iptools", dependencies = TRUE)
install.packages("rIP", dependencies = TRUE)
install.packages("rattle", dependencies = TRUE)
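As an optional convenience, the sketch below installs only the packages that are missing. The helper name `missing_packages` is hypothetical (it is not part of any library), and the `interactive()` guard is an assumption added so that nothing is downloaded when the script runs non-interactively.

```r
# Packages named in this tutorial.
required <- c("Rwhois", "iptools", "rIP", "rattle")

# Hypothetical helper: return the subset of `pkgs` that is not installed yet.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

to_install <- missing_packages(required)

# Only install from an interactive session (R console or RStudio).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, dependencies = TRUE)
}
```

Running the block a second time should leave `to_install` empty once every package is present.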

Information about the registrar responsible for the IP address can be found using the library below:


Partial Output:

index  key        val
1      NetRange   to
3      NetName    RIPE
4      NetHandle  NET-45-80-0-0-1
5      Parent     NET45 (NET-45-0-0-0-0)
6      NetType    Early Registrations, Transferred to RIPE NCC
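To make the `key` and `val` columns more concrete, here is a minimal base-R sketch that splits raw WHOIS-style `Key: value` lines into a similar two-column data frame. The helper `parse_whois_lines` is hypothetical and is not necessarily how the library builds its output; the sample lines reuse values from the partial output above.

```r
# Sample lines in the usual "Key: value" WHOIS layout (values taken from the output above).
raw_lines <- c(
  "NetName:        RIPE",
  "NetHandle:      NET-45-80-0-0-1",
  "Parent:         NET45 (NET-45-0-0-0-0)",
  "NetType:        Early Registrations, Transferred to RIPE NCC"
)

# Hypothetical helper: split each line at its first colon into key and value.
parse_whois_lines <- function(lines) {
  lines <- lines[grepl(":", lines, fixed = TRUE)]          # keep only "key: value" lines
  pos <- as.integer(regexpr(":", lines, fixed = TRUE))     # position of the first colon
  data.frame(
    key = trimws(substr(lines, 1, pos - 1)),
    val = trimws(substr(lines, pos + 1, nchar(lines))),
    stringsAsFactors = FALSE
  )
}

whois_df <- parse_whois_lines(raw_lines)
print(whois_df)
```

Splitting at the first colon only keeps values that themselves contain commas or parentheses intact.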

The server corresponds to the domain extension (for example, “.us”). If a server name is included in the query, the DNS parking name servers can be displayed.

The following code shows the name servers:

("", server = "")

Partial Output:

key val
Name server
Name server

The code shown below confirms whether this IP address is valid:



[1] TRUE
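For readers curious about what such a validity check involves, the following base-R sketch tests the dotted-quad shape of an IPv4 address. The function `is_valid_ipv4` is a hypothetical illustration, not the library call used above, and the sample addresses come from the reserved documentation range 192.0.2.0/24 rather than from this tutorial.

```r
# Hypothetical helper: TRUE if `x` looks like a dotted-quad IPv4 address.
is_valid_ipv4 <- function(x) {
  parts <- strsplit(x, ".", fixed = TRUE)[[1]]
  if (length(parts) != 4) return(FALSE)                    # need exactly four octets
  if (!all(grepl("^[0-9]{1,3}$", parts))) return(FALSE)    # digits only, 1 to 3 per octet
  all(as.integer(parts) <= 255)                            # each octet must fit in one byte
}

print(is_valid_ipv4("192.0.2.1"))
print(is_valid_ipv4("256.0.2.1"))
print(is_valid_ipv4("192.0.2"))
```

This sketch accepts leading zeros such as "01", which some stricter validators reject.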

To check whether the IP address is using a DNS proxy, use the following command:

proxycheck("", api_key = proxycheck_api_key())

For an IP address that is not behind a proxy, the output appears as shown below:


[1] "no"

An IP address can be categorized under multiple geographical regions. The next step will showcase basic statistics that can be derived from an IPv4 address.

Basic statistics

The geographical location of an IP address can be fitted with many statistical data models. Determining the correct geographical location can be difficult, because various DNS factors must be considered.

For example, the IP address overlaps with Lithuania, Germany, Cyprus, Netherlands, and Amsterdam.

Factors can include:

  • DNS variables found previously in this tutorial.
  • Directories.

A few helpful directories are listed in the table below:

Directory Name   Information
RIPE             Réseaux IP Européens (European IP Networks) serves Europe.
NIC              Server directory for extensions.
ARIN             American Registry for Internet Numbers serves North America and portions of the Caribbean.
IANA             Internet Assigned Numbers Authority provides overall directory and registrar information.
CIRA             Canadian Internet Registration Authority serves Canada.
  • Privacy redactions.

Classifying the country for this IP address is complex. Hostinger International Limited (AS47583) is the hosting provider whose ASN is responsible for IP addresses between to
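Checking whether an address belongs to a provider's range comes down to comparing numbers, since an IPv4 address is a 32-bit value. The sketch below is an assumption-laden illustration: the helpers `ipv4_to_numeric` and `in_ipv4_range` are hypothetical, and the range used is the reserved documentation block 192.0.2.0 to 192.0.2.255, not the range discussed in this tutorial.

```r
# Hypothetical helper: convert a dotted-quad IPv4 string to its numeric value.
ipv4_to_numeric <- function(ip) {
  octets <- as.numeric(strsplit(ip, ".", fixed = TRUE)[[1]])
  sum(octets * 256^(3:0))   # a.b.c.d -> a*256^3 + b*256^2 + c*256 + d
}

# Hypothetical helper: TRUE if `ip` lies between `first` and `last` (inclusive).
in_ipv4_range <- function(ip, first, last) {
  n <- ipv4_to_numeric(ip)
  n >= ipv4_to_numeric(first) && n <= ipv4_to_numeric(last)
}

# Placeholder range from the documentation block (an assumption, not tutorial data).
print(in_ipv4_range("192.0.2.1", "192.0.2.0", "192.0.2.255"))
print(in_ipv4_range("198.51.100.7", "192.0.2.0", "192.0.2.255"))
```

Doubles are used instead of R integers because values above 2^31 - 1 do not fit in R's 32-bit signed integer type.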

With reverse IP engineering being done on, we can find five possible geographical locations:

  • Lithuania (Li)
  • Cyprus (Cyp)
  • Germany (De)
  • Netherlands (Nl)
  • Amsterdam (Am)

Rattle can generate data models. A decision tree model can provide a logical breakdown. Shown below is a manually created IP address data frame:


Typically, a decision tree selects the highest possible number as the optimal choice.

In this scenario, the countries categorized as less optimal are analyzed. Amsterdam, Netherlands, and Cyprus were shown as the top three choices. Lithuania and Germany seemed to be less optimal.


Screen capture

It is possible to evaluate variable importance from a random forest model. Variable importance is shown in the image below:


Screen capture

With the highest score of the five locations, Lithuania showed the most links to the IP address. Germany also showed some correlation. In this Gini-based analysis, Lithuania generated the highest variable importance, with a value of 3087.48.
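For context, the Gini impurity used by tree-based models can be computed by hand as one minus the sum of squared class proportions. The sketch below assumes the importance score above is the usual mean decrease in Gini impurity; the function name `gini_impurity` is hypothetical, and the label vectors are purely illustrative rather than the data behind the reported value.

```r
# Hypothetical helper: Gini impurity of a vector of class labels, 1 - sum(p_i^2).
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)   # proportion of each class
  1 - sum(p^2)
}

# Illustrative only: one observation for each of the five candidate locations.
even_split <- c("Li", "Cyp", "De", "Nl", "Am")
print(gini_impurity(even_split))

# Illustrative only: every observation points at the same location.
single_class <- rep("Li", 4)
print(gini_impurity(single_class))
```

A split that moves the data from an evenly mixed set of labels toward a single label reduces the impurity, and variables that produce large reductions receive high importance.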

Linux reverse IP lookup

To verify the result, here is a quick lookup to run from the shell:

sudo curl

{
  "ip": "",
  "city": "Kaunas",
  "region": "Kaunas",
  "country": "LT",
  "loc": "54.9027,23.9096",
  "org": "AS47583 Hostinger International Limited",
  "postal": "44001",
  "timezone": "Europe/Vilnius",
  "readme": ""
}


Screen capture
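Once the lookup fields are in R, the combined `loc` and `org` values can be split into parts that are easier to compare. This is a minimal base-R sketch: the list `lookup` is typed in by hand from the output above (the `ip` and `readme` fields are left out), and the variable names are illustrative.

```r
# Values copied by hand from the lookup output above.
lookup <- list(
  city     = "Kaunas",
  region   = "Kaunas",
  country  = "LT",
  loc      = "54.9027,23.9096",
  org      = "AS47583 Hostinger International Limited",
  postal   = "44001",
  timezone = "Europe/Vilnius"
)

# Split "latitude,longitude" into two numbers.
coords <- as.numeric(strsplit(lookup$loc, ",", fixed = TRUE)[[1]])
names(coords) <- c("latitude", "longitude")

# Split the org field into the ASN and the organization name.
asn      <- sub(" .*$", "", lookup$org)       # text before the first space
org_name <- sub("^[^ ]+ +", "", lookup$org)   # text after the first space

print(coords)
print(asn)
print(org_name)
```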

A curl request can list the possible domain names hosted on an IP address. The command below performs a reverse IP lookup.

sudo curl

Partial Output:

Did you notice that the domain names listed above belong to companies registered with ARIN and CIRA, without any connection to RIPE?

This is an internet inconsistency: an IP address allocated in Europe would not usually be expected to carry domain names of North American companies.
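A comparison like this can be expressed as a small check in R. The sketch below is hypothetical throughout: the domain names are reserved example names and the registry labels are made up for illustration, so the result says nothing about the real lookup. It only shows the idea of flagging domains whose registry differs from the registry of the IP address.

```r
# Registry responsible for the IP address, as reported earlier in the tutorial.
ip_registry <- "RIPE"

# Hypothetical reverse-lookup results (placeholder domains and made-up registries).
domains <- data.frame(
  domain   = c("example.com", "example.net", "example.org"),
  registry = c("ARIN", "CIRA", "RIPE"),
  stringsAsFactors = FALSE
)

# Flag every domain whose registry does not match the IP address registry.
domains$inconsistent <- domains$registry != ip_registry

print(domains)
print(sum(domains$inconsistent))
```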

Code can help identify internet data as either accurate or inaccurate. A statistical coding approach can display a web of DNS relationships. Online identities can be revealed with internet directories and IP lookups.


  • Statistics can reveal internet inconsistencies.
  • Advanced data models can provide further DNS relationships.
  • Internet registrars play an important role in allocating IP data.

Happy coding!


Peer Review Contributions by: Srishilesh P S

About the author

Priya Kalyanakrishnan

Priya is a student of Analytics. She is skilled in other technical fields including programming in object-oriented languages, web coding, machine learning, and statistical coding. Although she may have studied the core basics, she continues to discover more as technology and interrelated areas of interests evolve.

This article was contributed by a student member of Section's Engineering Education Program. Please report any errors or inaccuracies to