Skip to main content

Command Palette

Search for a command to run...

IP Validation Algorithm in Ruby

Updated
4 min read
IP Validation Algorithm in Ruby
A

Professional software engineer working at CoverMyMeds, an enterprise healthtech firm that ensures patients get access to the medication they need.

Primarily focused on backend and systems design through Ruby and Elixir, but experienced with JavaScript and all things frontend

Interested in learning and teaching the craft of software; technical or otherwise

Join me as I innovate by iterating!

Today we're gonna take a stab at a Medium LeetCode problem. For a long time I never even attempted these because I was worried that they'd be "too hard", but this example will demonstrate that this is not the case.

Join me as we dive into the problem, the solution written out in English, then break down my algorithm to solve this algorithm. We will learn a little bit about IP addresses, and Ruby's nifty string manipulation methods.

Let's dive in!

The Problem

Given a string queryIP, return "IPv4" if IP is a valid IPv4 address, "IPv6" if IP is a valid IPv6 address or "Neither" if IP is not a correct IP of any type.

A valid IPv4 address is an IP in the form "x[1].x[2].x[3].x[4]" where 0 <= x[i] <= 255 and x[i] cannot contain leading zeros. For example, "192.168.1.1" and "192.168.1.0" are valid IPv4 addresses while "192.168.01.1", "192.168.1.00", and "192.168@1.1" are invalid IPv4 addresses.

A valid IPv6 address is an IP in the form "x[1]:x[2]:x[3]:x[4]:x[5]:x[6]:x[7]:x[8]" where: 1 <= x[i].length <= 4

x[i] is a hexadecimal string that may contain digits, lowercase English letters ('a' to 'f') and upper-case English letters ('A' to 'F'). Leading zeros are allowed in x[i]. For example, "2001:0db8:85a3:0000:0000:8a2e:0370:7334" and "2001:db8:85a3:0:0:8A2E:0370:7334" are valid IPv6 addresses, while "2001:0db8:85a3::8A2E:037j:7334" and "02001:0db8:85a3:0000:0000:8a2e:0370:7334" are invalid IPv6 addresses.

Understanding IP Addresses

An Internet Protocol address (IP address) is a unique identifier for devices on a network. There are two types of IP addresses in common use:

  1. IPv4: The IPv4 (Internet Protocol version 4) is a numerical label assigned to each device in a computer network that uses the Internet Protocol. An IPv4 address consists of four numbers separated by periods. Each number can be between 0 and 255.

  2. IPv6: The IPv6 (Internet Protocol version 6) is the most recent version of the Internet Protocol. An IPv6 address consists of eight groups of one to four hexadecimal digits, separated by colons. Each hexadecimal digit is a number from 0-9 or a letter from A-F.

Solution in English

In our solution, we distinguish between IPv4 and IPv6 addresses by their delimiters: periods for IPv4, and colons for IPv6. We split the input string accordingly.

For each segment resulting from the split, we validate its contents and length based on the respective IPv4 and IPv6 standards.

For IPv6 addresses, we additionally check that each segment only contains valid hexadecimal characters.

If the input string matches the criteria for either IPv4 or IPv6, we return the corresponding string. If it matches neither, we return "Neither".

This approach efficiently checks the validity of the IP address by inspecting each segment individually and verifying its conformance to the standards of either IP version.

Worth noting: you can totally use RegEx to solve this problem. I'm not going to, for clarity.

Code Solution

def valid_ip_address(query_ip)
    ipv4 = query_ip.split('.')
    ipv6 = query_ip.split(':')

    return "IPv4" if ipv4.length == 4 && is_ipv4(ipv4)
    return "IPv6" if ipv6.length == 8 && is_ipv6(ipv6)

    "Neither"
end

def is_ipv4(ipv4)
    ipv4.all? do |part|
        part == part.to_i.to_s && part.to_i.between?(0, 255)
    end
end

def is_ipv6(ipv6)
    ipv6.all? do |part|
        part.length.between?(1, 4) && is_hex(part)
    end
end

def is_hex(s)
    s.each_char.all? { |char| '0123456789abcdefABCDEF'.include?(char) }
end

Code Breakdown

First, we split the arrays up. To visualize how this shakes out, I've included a couple smaller diagrams below

IPv4

IPv6

Once we've split up the IP addresses into arrays, we can then iterate through each part and validate that the data aligns with how we expect it to. Diving further into each of the methods below:

  • valid_ip_address: This is the main function that splits the IP address by '.' and ':' to separate its components, and check whether it can be a valid IPv4 or IPv6 address based on the length of each part.

  • is_ipv4: Checks if all parts of the potential IPv4 address are integers without leading zeroes and are within the range 0-255.

  • is_ipv6: Checks if all parts of the potential IPv6 address contain 1 to 4 characters and these characters are hexadecimal digits.

  • is_hex: Checks whether every character in the given string is a valid hexadecimal character, which includes the digits 0-9 and the letters a-f or A-F.

Final Thoughts

Usually, Medium LeetCode problems will add one or two extra layers of complexity that make problems more difficult. This does not make them unsolvable!

Break each one down in English first, and take your time. Once you've got it solved in English and you understand the problem, only then write code.

As you solve more and more of these problems, they become less daunting and more like challenges you're prepared to tackle.

Thanks for reading!

Y

What's wrong with regex? The trick is to break it down into components that you can read without a brain double panic. A little string catenation at run time won't hurt, I promise

https://forums.craigslist.org/?ID=327033743

Yeah, converting that to OG line numbered BASIC isn't happening anytime soon 😂

Edit: Actually, those regexps match up pretty closely to how I'd approach the problem in old BASIC, even if I didn't have anything like InStr or IndexOf and could examine a string only one character at a time in a For loop. It might look a bit like a state machine or DFA

1
A
Austin2y ago

Hahaha I felt like it would've made it 10x harder to follow the logic of this post if it needed to also be a RegEx introducer. I'm not personally opposed to using it as long as you understand what you're matching for and how to modify it in case that pattern changes in the future

I'd actually love to dive further into regex myself so maybe that could be a fun future post!

With solving confirm-a-pattern-type problems in BASIC, is using regex super common? I've only ever had to use it a handful of times in ruby/js algos, but that may be because I am spoiled by these languages' built in methods/helpers/etc

Y

Austin That's actually the first time I've used regex in quite a while. And yeah, don't get too clever, because you'll need to be at least twice as clever to fix or change it six months from now 😂 Most dialects don't support regex, even the one I usually use, X11 BASIC, doesn't support it in the Windows version. I suppose there's always DIY, like this for PC DOS QuickBasic,

https://www.tapatalk.com/groups/qbasic/regular-expressions-qbasic-t34756.html

Y

Austin Oh here we go 😁

https://leetcode.com/problems/regular-expression-matching/description/

Consider the pattern, ".*.x.*."

1
A
Austin2y ago

yetanothertroll oooo this one looks fun

1
Y

Austin The trick is, how do the .* know when to stop?

Or even "a*.a" when given "aaaa"

Y

Austin I'm running into some not-fun parts now. Constructing the NFA isn't hard, but after that I'm between a rock and a hard spot. Running the equivalent DFA on the (potentially very large) input string is fast, but if the NFA has n states, the DFA could have up to 2ⁿ. Oof. Or you could run the NFA directly, but it could be in all n states for each character in the input string. n*m vs 2ⁿ, ouch

Y

Austin I'm running into some not-fun parts now. Constructing the NFA isn't hard, but after that I'm between a rock and a hard spot. Running the equivalent DFA on the (potentially very large) input string is fast, but if the NFA has n states, the DFA could have up to 2ⁿ. Oof. Or you could run the NFA directly, but it could be in all n states for each character in the input string. n*m vs 2ⁿ, eek!