21/05 — 2025

Tales From Mainframe Modernization

At my last workplace, I wrote transpilers (or just compilers if you prefer) from mainframe languages (COBOL, JCL, BASIC etc.) to Java (in Rust!).

Legacy code is full of surprises. In the roughly 200k lines of COBOL that I had the (dis)pleasure of working with, I saw some wonderful hacks to get around the limitations of the system. Mainframes are also chock full of history.

Base-10 numerics

This is the first thing that stood out to me when I looked at COBOL code. A data-definition (the phrase for “variable”) in COBOL is declared like so:

                 ,-- name                
                 |          ,- type      
               __|___     __|_      
            01 HEIGHT PIC 9(3).
            --        ---       
            |          |
            |           `- picture clause (keyword)
            `- level number              

That statement declares a variable called HEIGHT with type 9(3), which is shorthand for 999: a 3-digit decimal number. The possible values for this variable are 0 to 999!
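To make the shorthand concrete, here is a minimal Python sketch of how a picture string could be expanded and turned into a value range. The helper names (expand_picture, unsigned_range) are made up for illustration, and it only handles unsigned, integer-only pictures made of 9s; real picture clauses have many more symbols.

```python
import re


def expand_picture(picture: str) -> str:
    """Expand repeat counts in a picture string, e.g. '9(3)' -> '999'."""
    return re.sub(
        r"(.)\((\d+)\)",
        lambda m: m.group(1) * int(m.group(2)),
        picture,
    )


def unsigned_range(picture: str) -> tuple[int, int]:
    """Value range of an unsigned, integer-only picture made only of 9s."""
    expanded = expand_picture(picture)
    if not expanded or set(expanded) != {"9"}:
        raise ValueError(f"unsupported picture: {picture!r}")
    # n decimal digits can hold 0 .. 10^n - 1
    return 0, 10 ** len(expanded) - 1


print(expand_picture("9(3)"))   # 999
print(unsigned_range("9(3)"))   # (0, 999)
```

The range falls out of the number of decimal digits, not the number of bits, which is the whole point of the base-10 layout.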

Internationalisation

Below is another data-definition in COBOL, declaring 3 variables:

01 FOO-PERSON.
  05 FOO-NAME PIC X(5).
  05 FOO-HEIGHT PIC 9(3).

What that means is:

  • FOO-PERSON: a “group” variable consisting of two other variables
  • FOO-NAME: an alphanumeric type with 5 characters
  • FOO-HEIGHT: a numeric type with 3 digits (remember, base 10 and not base 2)

COBOL has an interesting construct called “REDEFINES”:

01 FOO-PERSON.
  05 FOO-NAME PIC X(5).
  05 FOO-HEIGHT PIC 9(3).

01 FOO-PERSONNE REDEFINES FOO-PERSON.
  05 FOO-NOM PIC X(5).
  05 FOO-TAILLE PIC 9(3).

FOO-PERSON and FOO-PERSONNE refer to the same region of memory.
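One way to picture this is as two sets of named windows over a single byte buffer. The Python sketch below is only an illustration under a few assumptions: the Field class is hypothetical, text is ASCII (a real mainframe would typically use EBCDIC), and the numeric field is written as plain text rather than with COBOL's numeric MOVE rules.

```python
class Field:
    """A named window (offset, length) onto a shared byte buffer."""

    def __init__(self, buffer: bytearray, offset: int, length: int):
        self.buffer = buffer
        self.offset = offset
        self.length = length

    def get(self) -> str:
        end = self.offset + self.length
        return self.buffer[self.offset:end].decode("ascii")

    def set(self, text: str) -> None:
        # Left-justify and space-pad (or truncate), like an alphanumeric MOVE.
        data = text.ljust(self.length)[:self.length].encode("ascii")
        self.buffer[self.offset:self.offset + self.length] = data


storage = bytearray(b" " * 8)       # 5 bytes of name + 3 bytes of height

foo_name = Field(storage, 0, 5)     # FOO-NAME
foo_height = Field(storage, 5, 3)   # FOO-HEIGHT
foo_nom = Field(storage, 0, 5)      # FOO-NOM: same bytes as FOO-NAME
foo_taille = Field(storage, 5, 3)   # FOO-TAILLE: same bytes as FOO-HEIGHT

foo_name.set("PETER")
foo_height.set("175")
print(foo_nom.get(), foo_taille.get())  # PETER 175
```

Writing through either name changes what the other one reads, because neither field owns any storage of its own.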

I helped modernise a codebase that had clearly been worked on by a French consultancy at some point, and they had decided to redefine all data definitions in French.

String parsing

Here’s another fun one:

       01 FOO-PERSON.
         05 FOO-NAME PIC X(5).
         05 FOO-HEIGHT PIC 9(3).
       .
       .
       .

       MOVE "PETER" TO FOO-NAME.
       MOVE 175 TO FOO-HEIGHT.

    *> display the entire memory region
       DISPLAY FOO-PERSON.
    *> PETER175

    *> reference modification (start:length): the first 7 bytes...
       DISPLAY FOO-PERSON (1:7).
    *> PETER17

So data-definitions simply give names to regions of memory, which enables a clever way to parse strings:

       01 DATE.
         05 DD     PIC 9(2).
         05 FILLER PIC X.
         05 MMM    PIC A(3).
         05 FILLER PIC X.
         05 YYYY   PIC 9(4).

       .
       .
       .

       MOVE "03 MAR 2025" TO DATE.
       DISPLAY "DAY: "   DD.      *> DAY: 03
       DISPLAY "MONTH: " MMM.     *> MONTH: MAR
       DISPLAY "YEAR: "  YYYY.    *> YEAR: 2025

    *> also works:
       MOVE "03-MAR-2025" TO DATE.
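The same trick can be sketched in Python as slicing a fixed-width record by a layout table. LAYOUT and parse_record are names invented for this example; the sketch assumes the input is padded or truncated to the record width, roughly as an alphanumeric MOVE to the group would do.

```python
# (name, length) pairs mirroring the DATE group; FILLER bytes are skipped.
LAYOUT = [("DD", 2), ("FILLER", 1), ("MMM", 3), ("FILLER", 1), ("YYYY", 4)]


def parse_record(text: str, layout=LAYOUT) -> dict:
    """Split a fixed-width record into named fields by position."""
    width = sum(length for _, length in layout)
    record = text.ljust(width)[:width]   # pad or truncate to the record width
    fields = {}
    offset = 0
    for name, length in layout:
        if name != "FILLER":
            fields[name] = record[offset:offset + length]
        offset += length
    return fields


print(parse_record("03 MAR 2025"))  # {'DD': '03', 'MMM': 'MAR', 'YYYY': '2025'}
print(parse_record("03-MAR-2025"))  # same fields: the separators land in FILLER
```

Nothing is validated here: the separator bytes are ignored whatever they contain, which is exactly why both the space and the hyphen versions "work".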

Early exit

I’d see this peppered around in a few places, and later realized it was a way to trigger an abnormal end (an “abend”) to a batch job, possibly triggering an error-handling routine in the outer job control system:

       01 CONSTANT-ZERO PIC S9(9)V9 VALUE 0.
       01 ABEND         PIC S9(9)V9.

           .
           .
           .

       COMPUTE ABEND = CONSTANT-ZERO / CONSTANT-ZERO.
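A loose Python analogy, with the caveat that it says nothing about how a real job scheduler reacts: a step that deliberately divides zero by zero dies with an unhandled error, and the parent process sees a non-zero exit status. STEP and run_step are hypothetical names for this sketch.

```python
import subprocess
import sys

# Hypothetical batch step: the deliberate 0 / 0 mirrors the COMPUTE above.
STEP = "constant_zero = 0\nabend = constant_zero / constant_zero\n"


def run_step(source: str) -> int:
    """Run the step in a child interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
    )
    return result.returncode


status = run_step(STEP)
print("step failed" if status != 0 else "step ok")
```

The outer process never needs to know why the step died; a non-zero status is enough to branch into whatever error handling it has.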

All the numbers

I have yet to find an explanation for this one, but I once found a file with just the first 800 natural numbers defined as string constants:

         01 TC0001 PIC X(5) VALUE "00001".
         01 TC0002 PIC X(5) VALUE "00002".
         01 TC0003 PIC X(5) VALUE "00003".
         .
         .
       *> .... 800 lines later ....
         .
         .
         01 TC0800 PIC X(5) VALUE "00800".

The file was definitely not generated, and I can’t imagine text editors on the mainframe were all that advanced either.

dd - disk destroyer

The DD statement in JCL stands for “data definition”, and is largely used to describe the files and IO streams used by a batch job. The dd command [1] on UNIX is named after this statement!

  [1] Wikipedia - dd (Unix)

Hi.

I'm Akshay, programmer, pixel-artist & programming-language enthusiast.

I am currently building tangled.sh — a decentralized code-collaboration platform.

Reach out at oppili@libera.chat.
