High-Speed Scientific Data Transfers, Scalability Impact on Security, and Internet and Routing Protocols

Ann Nguyen:
Hello. This is Ann Nguyen, Senior Associate Conference Producer with Cambridge Healthtech Institute. We are here for a podcast for the inaugural Data Transfer conference at Bio-IT World Conference & Expo 2018, which runs this May 15-17 in Boston, Massachusetts. We have on the phone Dr. Nirmala Shenoy, Professor of Information Sciences and Technologies with the Rochester Institute of Technology.

Nirmala, thank you for joining us.

Why have you focused your recent research on Internet architectures and protocols and computational challenges faced by network protocols?

Nirmala Shenoy:
Okay. I've been working in the field of networking for over 20 years now. I've seen the growth and evolution of the Internet during these years. This growth and evolution that I'm talking about has been in size, and not in the technology and protocols underlying the Internet. This sort of worries me. Over 10 years ago, the National Science Foundation here in the U.S. encouraged what we call clean-slate research ideas for a future Internet architecture. We received some funding for a simple solution that we had proposed based on network structures and their connectivity. The fact that I had some dedicated funding provided me with the opportunity to more closely investigate the current operational conditions in the Internet. And believe it or not, it's very difficult for anybody to comprehend that some of our Internet routers have routing tables as large as 750K, that is, 750,000 entries, and this seems to continue to grow.

If this situation is to be remedied, and I'm sure we all feel that it has to be remedied, the solution that we propose should be able to smoothly and gracefully transition from the current Internet protocols and displace them incrementally. What I mean by that is we can't just say, "Turn off the Internet protocols today, and then tomorrow, this protocol is going to be there." That cannot happen, because we all know how widely the Internet protocols are deployed. So this obviously posed a very interesting challenge, because not only do we need a solution, but the solution also has to be able to transition gracefully and displace the Internet protocol in an incremental manner.

We did have some ideas at that point in time for a solution to do this, but then we didn't have funding or resources to prove or validate them. So around three years ago, we received another small NSF grant under a U.S. Ignite initiative. This was to provision for gigabit communications during an emergency. So now we also had a valid application for the solution that we were proposing. So the last three years, we focused on developing all the conceptual ideas that we had into a prototype. We have successfully demonstrated these protocols on the GENI testbed. It's a U.S.-based testbed, and we have devices all over the U.S. We sort of collect them up into a network, code our network protocols, and run them on these devices. We also did a comparison with the current Internet protocol, that is, its routing protocol.

This new protocol that I am talking about actually operates in a way to bypass the Internet protocol without actually displacing it. So transitioning to this solution should be smooth. We can deploy the protocol in each domain and network incrementally. If there is a problem, the protocol can be disabled, and the Internet protocol will take over. And if everything works well, the Internet protocol and its routing protocols can be bypassed completely.

Research has always fascinated me, so this type of challenge where I'm trying to address some serious issues in the Internet is very, very fascinating to me. And that's how I landed up in this research.

Ann Nguyen:
Let's address scalability. How will supporting high-speed data transfers, in large networks or across multiple networks, or provisioning for larger amounts of high-speed data, impact the security of scientific data?

Nirmala Shenoy:
Talking about security, if you look at it – again, because of the fact that I've looked into the Internet protocols so closely – currently, the majority of our security issues arise due to what we call holes in the protocols that are being used for Internet operation. These protocols were developed 50-plus years ago. At that time, when they were developed, the security conditions were not as serious as what we have currently. So these protocols did not include security considerations bound up in the design, which is actually very important. Currently, if we develop any protocols, security is a major consideration. Our current security implementations in the Internet are mostly after the fact. After something happens, then we try to remedy it.

The more networks that scientific data crosses, the more vulnerable it is to security attacks, because the data is now visible to more networks and more people, and has more hacking points. Now coming to sending high volumes of data, operationally, there is no difference. The impact is the same. But if there is a security hack, then you have more data that is going to be vulnerable at any given point in time.

Ann Nguyen:
Finally, how have Internet and routing protocols evolved in the last decade? And what further changes do you foresee in coming years?

Nirmala Shenoy:
Believe it or not, our Internet protocols are operationally still the same as when they were developed 50, 60 years ago. IPv6, Version 6 as they call it, was introduced. But that would not alleviate the basic problems in routing.

We have two routing protocols that the Internet uses. One is called the Border Gateway Protocol. It's used for routing across different organizations, or autonomous systems as they call them. If you look up the origin of Border Gateway Protocol or BGP, it replaced another protocol called EGP in 1989. Changes have been made to BGP, but it still uses the same traditional path-vector-based routing. With the advent of the Internet service providers, because they had business relationships, some of the routing attributes were changed in BGP. If you ask me, "Has it become more complex or bigger?", that's the evolution I've seen. But the fundamental routing approach is not any different.
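
[Editor's illustration: the path-vector idea mentioned above can be sketched in a few lines of Python. Each route advertisement carries the list of autonomous systems it has passed through, and a router rejects any advertisement that already contains its own AS number, which prevents routing loops. The function names and AS numbers below are hypothetical and chosen only for illustration; this is a simplified sketch, not an implementation of BGP.]

```python
def advertise(local_as, path):
    """Prepend our own AS number before passing a route on to a neighbor."""
    return [local_as] + path


def accept_route(local_as, advertised_path):
    """Accept a route only if our own AS number is not already in its path."""
    return local_as not in advertised_path


# Hypothetical AS numbers taken from the private-use range.
path_from_origin = advertise(64512, [])               # origin AS announces a prefix
path_via_64513 = advertise(64513, path_from_origin)   # a neighbor passes it on

print(path_via_64513)                        # path as seen by the next router
print(accept_route(64514, path_via_64513))   # a new AS may accept the route
print(accept_route(64512, path_via_64513))   # the origin AS detects a loop
```

[In this sketch the path grows by one entry at every autonomous system the advertisement crosses, and each router still needs an entry for every reachable network.]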

Open Shortest Path First is another routing protocol, which we use within an organization. It's equally complex. The reason is both of them adopt what's called a flat routing approach. So when networks are added, the routing table sizes increase accordingly. The number of networks will keep increasing.

So, when will the complexities stop in the Internet? Trying to explain complexity – there's an example I use for my students. Let us say a postman comes to deliver letters to houses in my street, and let us say the houses are numbered in a random manner. This requires that we provide the postman with a map, which tells where each house is located, because there's no logic in the house number assignment.

But fortunately, the house numbers in our streets follow a logic. The reason that OSPF and BGP have such huge routing tables is that the tables grow with the number of networks. And the routing tables that I'm talking about, which are just so huge, provide a map on how to reach each and every network. A more structured approach to addressing our network or an autonomous system would simplify the routing process, as in the example of the postman delivering letters to houses. We have a logic in our address assignment. We're just not there currently in the Internet protocol.
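
[Editor's illustration: the contrast between a flat table and structured addressing can be sketched with Python's standard `ipaddress` module. The sketch uses ordinary prefix aggregation as a stand-in for "a logic in the address assignment"; it is not Dr. Shenoy's protocol. The address block, the helper names, and the next-hop labels are hypothetical.]

```python
import ipaddress


def flat_table(networks):
    """Flat routing: one entry per network, so the table grows with the network count."""
    return {net: f"next-hop-for-{net}" for net in networks}


def structured_table(networks):
    """Structured addressing: contiguous networks collapse into covering prefixes."""
    return list(ipaddress.collapse_addresses(networks))


# 256 hypothetical /24 networks carved out of one /16 block.
networks = list(ipaddress.ip_network("10.1.0.0/16").subnets(new_prefix=24))

flat = flat_table(networks)
structured = structured_table(networks)

print(len(flat))        # one entry per network
print(len(structured))  # entries needed once the addresses follow a structure
print(structured[0])    # the single covering prefix
```

[With numbering that follows a logic, the 256 separate entries reduce to one covering prefix, much as a postman needs only the street name and number rather than a map of every house.]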

However, we also have to keep in mind that it should be possible for the new solution to be introduced incrementally without trying to displace the Internet protocol and its routing protocols entirely. Probably, this is a very serious challenge. I mean, it's difficult to comprehend. But it's a very serious challenge, because of which, introducing new solutions is very difficult in the Internet. Unless we make the above changes, the operational conditions in the Internet will get worse.

We might wonder how the protocols sustained this load thus far. It is because of the tremendous advances we have in microchip technologies and fiber technologies which we use for transmitting traffic. These have supported the complicated operations and excessive overhead traffic that we are generating due to the non-optimal Internet communications system. We need to look to a reliable and high-speed Internet lane protocol, which I'm going to be talking about.

We had the above design considerations in mind. Also, the other thing that is a really very positive point with this protocol is that it bypasses the Internet protocol completely. It operates below it. And all the security issues that could be faced at Layer 3 also get bypassed. That is a huge advantage. So currently, what the Internet protocol and the routing protocols are doing in a combined fashion, the protocol that we came up with does as a single protocol.

Ann Nguyen:
Thank you again for your insights, Nirmala.

That was Dr. Nirmala Shenoy of Rochester Institute of Technology. She'll be speaking during the Data Transfer track at Bio-IT World, which takes place this May 15-17 in Boston.

To learn more from her, visit www.Bio-ITWorldExpo.com for registration info and enter the keycode “Podcast”.

This is Ann Nguyen. Thanks for listening.
