OH2BNF – LSR-SDR (Large Scale Raspberry SDR)

[Up to 90 MVS instances on the cluster on Jan 16th 2019]

Before we go any further, take a look at this picture…

…which reinforces my earlier observations on how advanced mainframe technology has always been. It took almost three decades for the PC world to catch up with this…

I went ahead and spawned a total of 90 MVS instances on the cluster’s 30 currently active nodes.

Each of these – if you allow the term – mainframe instances was fed a constant supply of JCL to keep it busy. Three instances per node seems to be the realistic maximum for a Raspberry Pi 3.

Regarding OpenShift, I have not gotten much further, as other elements of the cluster’s overall stack have kept me busy, as has the ever-deepening study of MVS.

As of now, I run Docker Swarm and Portainer. I have also been looking into OpenFaaS (Functions as a Service), with some peculiar thoughts in my mind on how it might be leveraged in the overall setup.

On the Raspbian side, the load was hovering at around 4-5, with not much room for anything else. The nodes still responded reasonably, but you could sense that I/O spikes were pushing the little ‘berry near its maximum.

On the MVS instances running in the Hercules emulators I could see some lag at these Raspberry Pi load levels, for example when performing a TSO logon, but nothing dramatic.

SoC temperatures have remained at around 70 degrees Celsius now that I have makeshift cooling in the cluster.

After the evaluation period, I went ahead and purchased a license for ZOC Terminal (a very feature-rich terminal client, including 3270 emulation). The picture above does not do the product justice, but it shows the useful live thumbnail view.

https://www.emtec.com/zoc/index.html

With abundant free alternatives, one needs a good reason to purchase a commercial offering, and I found exactly such a reason while examining various – if not all available – terminal clients.

I even installed a FreeDOS 1.2 setup with the otherwise very good Mocha emulator, which is freeware (for DOS) these days.

In the end, I find ZOC Terminal extremely useful and worth the investment. In particular, I like being able to switch between the tabbed and thumbnail views, and the good support for scripting.

As a final note, I received an email from IBM, as I had applied for the Master the Mainframe learning opportunity. It was 4:30 AM when I read the email; I was instantly awake and feeling really happy about it.

IBM is doing a really nice job in providing these learning opportunities.

[Bunch of containers released to the wild on Jan 12th 2019]

On Docker Hub, as well as on several GitHub pages, there were of course Dockerfiles for running MVS in a container. These were for the x86_64 architecture, so I tested one of them and modified it for ARM.
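Purely as an illustration of what such an ARM modification can boil down to, here is a hedged sketch. The base image, the Hercules package from the Debian repositories, the location of the MVS 3.8j “Turnkey” files and the startup script name are all assumptions, not my actual Dockerfile.

# Illustrative only - base image, package name, TK4- path and startup script are assumptions
cat > Dockerfile <<'EOF'
FROM arm32v7/debian:stretch
RUN apt-get update && apt-get install -y hercules && rm -rf /var/lib/apt/lists/*
COPY tk4-/ /opt/tk4-/
WORKDIR /opt/tk4-
EXPOSE 3505 3270 8038
CMD ["./mvs"]
EOF
docker build -t registry.enigma.fi:5000/mvs38j:arm-v2 .
docker push registry.enigma.fi:5000/mvs38j:arm-v2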

There were a couple of things I had to sort out beforehand, and one of them was setting up a private registry, for which I set up a CentOS 7 VM on the ESXi host.
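For reference, the stock registry image can serve this purpose with something along these lines; the storage path and the insecure-registry workaround are illustrative assumptions rather than my exact configuration.

# Run the standard registry image on the CentOS 7 VM (illustrative; storage path is an assumption)
docker run -d --restart=always --name registry -p 5000:5000 -v /srv/registry:/var/lib/registry registry:2

# Each Raspbian node must either trust the registry's TLS certificate or list it as an
# insecure registry in /etc/docker/daemon.json, for example:
#   { "insecure-registries": ["registry.enigma.fi:5000"] }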

Launching the containers was performed via a simple SSH key-authenticated loop, since the SLURM 18.03 update on Raspbian is still underway on the nodes and, to be accurate, the OpenShift OKD 3.10 configuration for this hybrid CPU architecture is also incomplete.

docker run -dit -p 3505:3505 -p 3270:3270 -p 8038:8038 registry.enigma.fi:5000/mvs38j:arm-v2

AND

docker run -dit -p 3506:3505 -p 3271:3270 -p 8039:8038 registry.enigma.fi:5000/mvs38j:arm-v2

This shows a simple way of exposing container ports on unique host ports on the node itself, allowing multiple instances to run simultaneously without port conflicts. Obviously, to reach the second container one has to use the alternate host-side port numbers (3506, 3271 and 8039).
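To give an idea of the launch loop mentioned above, a minimal sketch could look like the following; the node naming scheme, user and key path are assumptions.

# Illustrative launch loop - node names, user and key path are assumptions
for node in node01 node02 node03; do   # ...and so on for the 30 participating nodes
  ssh -i ~/.ssh/cluster_key pi@${node} "
    docker run -dit -p 3505:3505 -p 3270:3270 -p 8038:8038 registry.enigma.fi:5000/mvs38j:arm-v2
    docker run -dit -p 3506:3505 -p 3271:3270 -p 8039:8038 registry.enigma.fi:5000/mvs38j:arm-v2
  "
done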

There are now 60 MVS containers running on the cluster, two for each participating node (30 out of the 40 Raspberries). The remaining 10 nodes are doing something else, and a couple of them need a little TLC.

To keep the MVS instances busy day and night, I feed them via ports 3505/3506 with JCL (compiler test) jobs for a variety of purposes, using the hercjis command.

https://github.com/wfjm/herc-tools

https://github.com/wfjm/mvs38j-langtest
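As a hedged illustration of the feeding itself: the card readers are exposed as Hercules sockdev devices on ports 3505/3506, so job decks can be pushed to them over plain TCP. The sketch below uses netcat rather than reproducing the hercjis options from memory, and the node and job file names are placeholders.

# Illustrative feeder loop - node names and the JCL file name are placeholders
while true; do
  for node in node01 node02 node03; do
    for port in 3505 3506; do
      nc -q 1 ${node} ${port} < jobs/some_langtest_job.jcl   # push one job deck to the sockdev reader
    done
  done
  sleep 300   # pace the submissions
done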

The MVS instances are running at sustained rates of approximately 30–40 MIPS. A Raspberry Pi 3 seems to have sufficient performance to execute 2-3 of these instances simultaneously, and for less I/O-bound workloads 4 might be possible, so we are looking at anywhere between 40 and 160 MVS containers.

To recap, what needs to be done in particular is to move the container orchestration fully into OpenShift, and to examine integrating a traditional HPC batch subsystem (such as SLURM) for workload management outside the MVS instances.

[I had my artistic moment on Jan 7th 2019]

…this being the first instance of MVS 3.8j “Turnkey” I started in a Docker container. There should be ample capacity to run 100-200 such instances…

[update on Jan 5th 2019]

After running several thousand batch jobs through the cluster’s MVS instances, I feel that it is time to start systematic studies to gain a thorough understanding of the mainframe.

This is somewhat easier said than done, because a majority of the mainframe documentation tends to “start up high”, assuming that the mundane basics are already at the routine level.

But there is nothing “mundane” in mainframes.

Case “text editor”: as is common in the Unix/Linux world, my first encounter with the venerable vi over two decades ago was not all that pleasant. It took a bit of practice back then to become comfortable with it. If you try Google, you’ll soon see that one of the most frequent searches regarding the vi editor is how to exit it.

However, once you become sufficiently familiar with vi, you likely consider it the best thing since sliced bread. Which it is. I don’t use Emacs.

Things get hairier with MVS for a beginner: I did not even have a clue about the name of the editor, or whether it can be invoked directly from the “READY” prompt in TSO or only through “RFE”. Searching Google for instructions becomes somewhat complicated if you do not know what to search for.

I have also watched dozens of videos on the ‘Net and paused them repeatedly, trying to catch which particular key was pressed at a given point in time, or what “mode” the editor was in while a seasoned mainframe fellow edited a file.

I find all this hilarious!

It is actually quite fun to be such a beginner with a given technology. While setting up the Hercules emulator and MVS 3.8j “Turnkey” is rather simple, getting to know the beast is another thing.

Just about everything on a mainframe is radically different from a Unix or a Linux system. It is as if over two decades of sysadmin experience means nothing. That is hilarious as well.

There are multiple avenues for learning more, though. In fact, you can even enroll in a training course and be given access to a learning system, a real Z series machine. That is on the curriculum before the summer.

I have been examining source code in a number of languages that have been prevalent in the mainframe world since the late 1950s. I am not much of a programmer myself, so I can’t really provide any in-depth analysis on this subject, but I do find something highly aesthetic in COBOL and FORTRAN, for example.

COBOL is poetry.

I am amazed at the quality of the contributions people have put forth over the past half century to make the mainframe what it is. The people behind the Hercules emulator and “Turnkey” with its near relatives also deserve a big thanks.

[update on Dec 30th 2018]

So this is one of those projects that are, apparently, in a constant state of being re-evaluated and re-purposed. A week before Christmas I encountered something called “Hercules”, an emulator that allows one to run a genuine early-80s-vintage mainframe operating system.

Of course at that point things got out of hand.

After a crash course in installing, configuring and operating a mainframe operating system, I became so enthusiastic about it that I quickly spread those mainframe instances across the cluster.

In the picture – which is black and white to commemorate more than half a century of very smart people creating something known as the mainframe – you can see over 40 “mainframes”, although I could only fit 20 of the 3270 consoles on the displays at one time.

Now I am studying things like JES2, JCL, TSO, FORTRAN, ASSEMBLER, COBOL… and continue also to merge prior plans with all this. Let’s see if I can write out the concise game plan:

“Based on OpenShift (OKD) orchestration, merge a traditional HPC cluster with container technology. Execute the MVS 3.8j mainframe OS in containers, and involve them somehow in the processing of the I/Q data coming from SDR receivers at a lower layer of the stack, even if it requires learning obscure, beautiful and highly logical programming languages.”

Yeah right. At least there is a plan. But – and as stated earlier – my New Year’s Resolution for 2019 is limited to working with the cluster, not necessarily having it “ready” or completed as a project, as it seems the cluster is primarily serving as a workbench, a tool for learning.

Hence in a way it is “ready” and serving a purpose already.

[update on Dec 8th 2018]

Currently working on Docker-related technologies and on merging a traditional HPC (SLURM-based) cluster with containers. I acquired 40 USB sticks and migrated from PXE to USB booting, as Docker really likes to have a true local filesystem at its disposal.

The consumer-grade ADSL router with its “features” has been a major issue, causing all sorts of havoc throughout the project. PXE-booting the entire cluster would usually result in the router resetting itself. On top of that, the router likes to reset itself every now and then anyway, and being the default route for the entire cluster network, as well as for other systems at home, it is a nuisance.

It seems there is no option other than to completely isolate the cluster into its own network, with its own infrastructure services.

[Origins of the cluster]

I received my call sign OH2BNF on April 19th 2018, having passed the General Class examination at the end of March in Helsinki. I am a member of SRAL (Suomen Radioamatööriliitto ry) – The Finnish Radio Amateurs League, of course of OH2AAV Vantaan Radioamatöörit ry – The Radio Amateurs League of Vantaa, and finally of Suomen DX-liitto, the Finnish DX Association.

In addition to the ICOM IC-7200 transceiver, various handheld devices and a Tecsun S-2000 receiver, I have numerous SDR devices and a remote QTH project involving 40 Raspberry Pi 3 systems. In fact, this system will be the main content of this page as the project moves forward.

Bitscope in Australia sells a 19″ rack enclosure allowing the installation of 40 Raspberry Pi 3 / 3+ single board computers via Quattro Pi boards (10 of them), in a setup that supports up to 160 USB devices and, most importantly, provides sufficient and simple power management for them.

Bitscope Quattro Pi board shown from both sides with 4 Raspberries installed

Assembling the enclosure takes about 2 hours in total – a lot of parts…

In the rack I have a 48V/25A power supply made by Suomen Muuntolaite Oy (thanks to Bror OH2BNN). Those 40 Raspberry Pis only draw about 2 A when PXE-booted and idle, so the same power supply is also used for various other devices via converted voltages.

48V/25A PSU and Proliant DL20 Gen9

For the x86_64 hardware I use an HPE ProLiant DL20 Gen9. The hypervisor is ESXi 6.5, and the virtual machines include CentOS 7 (OpenNMS and Ganglia, likely also SLURM for cluster/grid job scheduling), Raspbian Pixel for x86_64 as the PXE server and central cluster database server, and Windows 10 Pro for those few ham radio applications only available on the Windows platform.

The Raspberry Pi boards run a fairly heavily modified version of Raspbian. While the basic features are as delivered, I have had to install quite a bit of additional software into the image. This deals primarily with cluster-type applications, ranging from frameworks such as SLURM to specific items such as MPI libraries. Raspbian being Debian-based, this has been trivial to implement.
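For the record, a hedged sketch of the kind of additions baked into the image; the exact package selection is an approximation, not the full list.

# Illustrative additions to the Raspbian image (the exact selection is an approximation):
#   slurm-wlm + munge            - SLURM workload manager and its authentication service
#   openmpi-bin + libopenmpi-dev - MPI runtime and development libraries
#   ganglia-monitor              - gmond agent reporting to the Ganglia head node
sudo apt-get update
sudo apt-get install -y slurm-wlm munge openmpi-bin libopenmpi-dev ganglia-monitor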

The network switch is an old 3Com 48-port 1 Gb unit, which I repaired by replacing 2 worn-out fans. The picture above shows the rack during an early setup phase. Remote power management is handled with a Ratio Electric IP Power Switch, which I acquired for a paltry 20 euros from a surplus sale. This was quite a deal, as those used to sell for several hundred euros.

The rack itself is a 21RU unit from Thomann. It was originally intended for audio equipment, was quite cheap and offers a proper cabinet for the system. The rack is about 1.5 meters high, and its depth is less than that of ordinary IT racks. This was intentional, but it has imposed a few limits on choosing equipment for the solution. Most rack equipment such as servers tends to have a much larger installed depth than, for example, the ProLiant DL20 Gen9 I specifically chose for the purpose.

Cabling with CAT6 can be a chore

There are basically two phases in the buildup. First, the whole setup must be made to function as an HPC (High Performance Computing) type of cluster, with full remote management capabilities for every aspect of the system.

Ganglia showing a partial load report of the cluster

The final phase is to convert the system to ham radio-specific tasks.

I anticipate 20, 30… or so SDR receivers, some of them operating individually, others as a coherent array using a common external oscillator. There may also be SDR transceivers in addition to a “real” radio such as the ICOM IC-7200. There are also numerous other devices such as antenna switches, NATO-qualified Stridsberg antenna multicouplers, upconverters, LNAs, etc.

As I was eavesdropping on Santa Claus last Christmas, I began to think about a more heavy-duty setup – notice Stridsberg connected to a magnetic loop antenna

At the end of the final phase a proper remote QTH will be selected and the system relocated there. As I live in a high-RFI environment with a particularly challenging situation (an apartment building with no permission for sufficient antennas), remote operation is really the only option. I also like to incorporate my 20 years of professional Unix/Linux systems administration into the project.

The primary objectives are to incorporate automated adaptivity into the system at large – for example leveraging band condition information, WSPR (Weak Signal Propagation Reporter) and friends – automated signal detection and decoding, great flexibility in terms of individual cluster nodes being able to respond quickly to various needs and tasks, a strong emphasis on parallel processing where applicable depending on the problem type and dataset, and support for multiple end users benefiting from the computing and reception capacity of the cluster – to name the most significant.

Question One: “Why?”

Because it is a lot of fun and I like to merge Unix/Linux with a new hobby, ham radio. I also quickly noticed that with all the various digital and other “modes”, not to mention frequency bands and other variables, there seemed to be so much to adjust, tune, configure when moving from one reception type or band to another. I thought that perhaps I might look into automating most of it…

Question Two: “Why Raspberries – aren’t they weak in terms of computing power?”

I am 48 years old with 20 years in Unix/Linux systems administration, including HPC computing sites such as Nokia Research Center and my current employer Qualcomm. I learned a long time ago that any given task needs sufficient performance. The Raspberry Pi is a wonderful little device that is right at home with moderately heavy computing tasks, if used appropriately.

Oh, they are sweet, aren’t they?

The primary reason for having not just a few but 40 of these has to do with USB I/O, which I soon observed was not that great even on a fairly powerful desktop computer. Try running 4, 8 or more SDR receivers at full speed and you’ll quickly see what I mean. Hence, as we say in Finnish, “lisää rautaa rajalle” (“more iron to the border”, i.e. throw more hardware at the problem) – I now have capacity for up to 160 USB devices, which should be more than enough.

A single Raspberry will do fine with 1 or at most 2 relatively narrow bandwidth SDR receivers such as the famous RTL-SDR Blog v3 (which is the type I am primarily using). Beyond that, the I/Q data eats up more networking bandwidth than is available. So just add more of those pesky Raspberries, right?
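To put a rough, back-of-the-envelope number on that, assuming the common 2.4 MS/s sample rate and 8-bit I + 8-bit Q samples of the RTL-SDR:

# 2.4 MS/s * 2 bytes per complex sample * 8 bits per byte, printed in Mbit/s
echo "scale=1; 2400000 * 2 * 8 / 1000000" | bc   # ~38.4 Mbit/s per dongle

Two dongles streaming at full rate thus already approach the Pi 3’s roughly 100 Mbit/s Ethernet (the 3B+ does somewhat better, but still shares a single USB 2.0 bus with the dongles themselves), which matches the one-or-at-most-two-per-node observation above.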

Question Three: “Are there similar setups around the world?”

Sure, many of them. I haven’t specifically seen a Bitscope enclosure with 40 Raspberries dedicated to ham radio activities, but the Raspberry and ham radio software are widely used together. There are many hams who deserve credit for inspiring this project – I have spent endless hours reading other people’s blogs.

Someone I do want to mention specifically is Oona Räisänen (OH2EIQ). Go and check out her inspiring blog “Absorptions” at http://windytan.com – that blog got me interested in all this. Be prepared to spend hours reading…

Question Four: “Why so many Raspberries and why not a broad bandwidth XYZ receiver?”

The bandwidth available from devices other than the RTL-SDR v3 is not the issue at all. In fact, this project has dimensions and objectives that extend far beyond the bandwidth question; and perhaps it needs to be clarified that the intention is specifically to use 40 Raspberry Pi devices, not to limit their number.

As stated, we are still in the first – infrastructural – phase, and we’ll get to the interesting stuff later on. Certainly, as has been indicated, there are some interesting possibilities in the forecast. Setting up a complete, remotely manageable and robust HPC cluster – despite the admittedly low computing performance of the Raspberries – nevertheless takes time; in particular, the PXE booting is nowhere near robust enough at the moment.

UPDATE: one faulty new Cat6 cable was discovered, along with certain issues in the consumer-grade ADSL router’s built-in DHCP function. I will likely move the cluster into a separate network and provide DHCP in that segment using a proper, self-contained solution.
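A self-contained DHCP/TFTP service for such an isolated segment could be as simple as a dnsmasq instance on the PXE server VM; the interface name and address range below are assumptions, not my actual configuration.

# Illustrative only - interface name and address range are assumptions
sudo dnsmasq --interface=eth1 \
             --dhcp-range=192.168.50.100,192.168.50.199,12h \
             --dhcp-boot=bootcode.bin \
             --enable-tftp --tftp-root=/srv/tftp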

Also, port-level saturation was discovered on the network switch during PXE booting, and it was initially resolved by staggering the power-on of the various functions in the overall cluster and its controller layer. Further investigation still needs to be carried out, including the possibilities offered by switch port configuration such as QoS (Quality of Service) and other features.

Initial tests show the whole cluster now boots at once, thanks to the reduced network congestion on the Raspberry-connected ports of the switch. One specific node is, however, still having odd issues, which I will investigate over the following days.

Broadening the bandwidth of SDR reception seems to come up frequently in discussions, but it is really not among the top requirements for this operation. The stuff of interest rather revolves around the cluster at large: its ability to adapt to varying conditions, pattern detection and automation.

Parallel processing, when applicable and supported by the “problem” and the dataset, is also one of the priorities. As for the vast amount of radio-related software, examining the possibilities and testing, testing, testing 🙂 is what I am most looking forward to. And finally, for those with an appetite for the heavy stuff, I recommend keeping an eye on Tim O’Shea and company at DeepSig Inc.

– 73 de OH2BNF

Maidenhead KP20JG