History of experix
The user's view of experix
The structure of experix
Asynchronous, Timed and Concurrent Execution
Hardware Interfaces

Downloads, Information, Screenshots, Forum hosted by SourceForge

History of experix

In the late 70's, at Cornell University, I made our first cell poker for studying the mechanical properties of living cells. Another person made a signal averager that accepted simple commands over a serial line. For people to do experiments, we needed software that would translate a high-level concept into machine actions. For example, the poke command would do the following: output a motor control signal while digitizing the sensor signals, make graphs of the data, derive the probe force from the position measurements and graph that, and send the data to the department's mainframe for archiving and more analysis.

We moved to Washington University, and over the years our data acquisition and control system evolved as it migrated from one platform to the next: first a VT100 terminal connected to the department's Digital VAX mainframe and the home-made signal averager; then a National Semiconductor single-board computer added to an Intel 8085 CPU system running CP/M with CAMAC data acquisition modules; next a PC running extended DOS with a data acquisition card; and now a laptop PC running Linux with a PCMCIA card or (very recently) a USB data acquisition module. For the CAMAC system I wrote the cell poker operating program in NS32000 assembly language. For the DOS system I wrote what could be called the first version of experix, in Borland C with some inline assembly, using a stack and operator model for the command interpreter. The modern Linux-based experix system is written in C (and some assembly) for the GNU compiler.

When the inadequacy of DOS and its demise as a supported system forced me to update the platform, I dabbled in Microsoft Windows. I was quickly put off by the unavailability of information on how to write kernel modules. I knew that I needed to control everything, not rely on packages that I can't modify and which are guaranteed to become obsolete and unmaintainable. Then I found Linux, or it found me, and the program has been growing happily ever since.

I like command strings. I can't tolerate GUIs that clutter the screen with inscrutable icons which need to be equipped with more-or-less informative messages that pop up if I park the mouse there long enough. I lack the patience to search through a tree of pull-down menus when I could have typed a command in half a second. That's no way to run an experiment when you need to be thinking about the science and your sample is expiring. It may be suitable for a system that has been totally characterized and simplified to the point where it nearly all fits in one screen. It definitely does not work for a complicated, buggy home-made rig where people always want to do something I didn't think about when I wrote the software. And don't get me started on the wiring-diagram concept of data collection/analysis. It's completely inaccessible to the users and not at all a natural way to approach most of the problems that I have encountered.

The user's view of experix

The user goes to his directory and invokes an experix script file. That starts experix and directs it to a screen layout file, and defines commands and variables for the application. The goal in writing the application script is to find out what is being done in the experiment, and divide that into a hierarchy of logically separate tasks: perform a measurement, analyze and present the data, store the data, review old data and so on. The commands that are intended for the operator to use routinely should be as simple as possible and should have easily-remembered names and arguments and useful help messages.

The program presents a command prompt. The user types a command string, using readline editing and history functions. A user-given command string would typically be something short and simple, including a few of the commands that are defined in the application file. It may, however, be arbitrarily long and include any commands, variables and operators in the experix core as well as commands and variables defined in the loaded script files, and ones that the user creates as he works. A general command string consists of tokens separated by whitespace, and execution proceeds in two phases. In the setup phase, some syntax checks are done and branch operators and labels are identified. If syntax errors are found at this stage, an error message is given and the command is aborted. In the second phase of command execution, tokens are identified and acted on sequentially. Command tokens cause actions in which data items from the experix stack and variables are used and changed. These are the general categories of command tokens, with a few examples for each.

help requests

   ?fft   ??%

information (view or edit help file)

stack manipulators

   \d   3\p   \2,3S

manage the stack

stack information

   \1;t   2\;b   \:

classify stack objects, get stack statistics

command branches

   $3   !=0$a.   $?-10015

change execution point in command string

suspended-command-tail

   \/>   \//   \/D

use command-tail arguments

command file exec

   &../dist/xpx/graftrix

run commands from a file

numbers

   123   .12e3   #123   #x7b

put numbers on the stack

numerical constants

   ;e   ;pi   ;ln2

put certain useful numbers on the stack

arithmetic operators

   +   -   *   /   %

do arithmetic on numerical stack objects

function operators

   .ln   .sin   .!

apply functions on numerical stack objects

comparison operators

   ==   !=   <   >=

do comparisons of numerical stack objects

logical operators

   :A   :c   :*   :S

logical and integer operations on stack objects

local variable ops

   ,s   ,.c   ,/   ,'p

use command string local variables

array element ops

   [   [s   [r=

access elements, ranges and subspaces of arrays

array block ops

   ]   ]=   ]+   ]>s

create and decompose arrays

random number ops

   ]R   ]P   ]G

make random deviates of arrays

command sub-strings

   "quoted strings"   {braced    strings}   ''names

text, filepaths, device commands; experix command strings; command item names

complex and polar ops

   )   )p   (r   (*

make and use complex and polar numbers

data type conversions

   .>1i.   .>2d   .>W

convert numbers from one type to another

names of variables

   pokr   Asegs   fname

use named variables (numbers, arrays, etc.)

names of functions

   usb   file   graph

perform functions given as compiled code

names of commands

   runpokr   bm2xpand   gral01c

perform commands given as experix code

named item references

   'usb   'fname1   'runpokr

pointers to variables, functions, commands

file control objects

   datafile1   junkFCO

access files and pipes

device control objects

   HAN_pokr1   HAN_daq

access devices, such as USB

thread control objects

   THR_runpokr   THR_12

run experix commands in threads

Operators, functions and commands get arguments from the stack and leave results there. This leads naturally to a programming style called reverse Polish notation, which is significantly generalized in experix. All of them modify their actions according to what objects are on the stack, and functions and commands can have command-tail arguments to further modify their actions. For simplicity in typing commands, writing script files and maintaining the session log file, the use of menus and control keys is avoided. Conciseness is favored over explanatory names. Everything is included in this paradigm. This all makes experix script look very strange. Here is an example, and the picture that it creates.

20 ;pi * 2048 / 2048 ]+ .sin .exp ]G 29 graph/swK 50 fft/l 29 graph/STzR





After 20 ;pi the stack has the number 20 in level 2 and pi (3.14...) in level 1. The * operator removes these and puts their product in level 1. After 2048 / level 1 contains 20*pi/2048. The ]+ operator takes two numbers and returns a ramp array. .sin, .exp and ]G transform the number or array in stack level 1, so that it becomes the Gaussian-distributed random deviate of the exponential of the sine of the ramp. The graph function (when it finds just a number in level 1) draws a graph of the array in level 2, in the display region whose number is level 1. The fft function as used here transforms the array in level 1 by applying a low-pass Fourier filter. Both graph and fft have command-tail arguments that qualify their actions.
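The first steps of that walk-through can be traced with a few lines of Python. This is only a sketch of the stack discipline, not experix code: the function name run is made up for illustration, and only the tokens 20, ;pi, *, 2048 and / are handled.

```python
import math


def run(tokens):
    """Evaluate a tiny RPN subset; stack[-1] plays the role of level 1."""
    stack = []
    for tok in tokens.split():
        if tok == ";pi":
            stack.append(math.pi)
        elif tok in ("*", "/"):
            level1 = stack.pop()   # most recent entry
            level2 = stack.pop()   # the one beneath it
            stack.append(level2 * level1 if tok == "*" else level2 / level1)
        else:
            stack.append(float(tok))
    return stack


print(run("20 ;pi"))            # 20 in level 2, pi in level 1
print(run("20 ;pi * 2048 /"))   # a single entry holding 20*pi/2048
```

Each operator consumes its arguments and leaves its result in level 1, which is why the command string needs no parentheses or temporary names.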

The stack is a data structure used mostly, but by no means exclusively, in a last-in-first-out manner. It holds objects of many kinds: numbers in different integer and floating-point formats; complex and polar floating-point numbers; multi-dimensional arrays of all kinds of numbers; strings; references to variables, functions and commands; special objects for file access, device control and thread control. Operators, functions and commands may use any number of arguments from the stack and also variables in the program, and they may alter the stack and variables. The manner in which they do this is described in help files that are accessed by the help requests. Commands, functions and operators are "overloaded" so that where it makes sense, the same thing works on all data types, and on arrays as well as single numbers. Many functions have "side effects" (which are usually their raison d'etre) such as displaying text and/or graphs, reading and writing files, and operating special devices.

When the interpreter encounters a token which is the name of a command string, the present command string is suspended and the named one is submitted for execution. If that one finishes without error, execution of the suspended command string resumes. Execution of command tokens continues until the end of the command string is reached or an error condition arises. In case of an error, diagnostic messages show the kind of error and the execution point in the command string and also in the suspended command strings that led to the present one.
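The suspend-and-resume behaviour, and the way an error report lists every suspended command string, can be illustrated with a small recursive interpreter. This is a sketch under stated assumptions: the names CommandError, execute and run are hypothetical, only numbers and + are understood, and the real diagnostics are certainly richer.

```python
class CommandError(Exception):
    """Carries the chain of (command string, token index), innermost first."""

    def __init__(self, message, trace):
        super().__init__(message)
        self.trace = trace


def execute(command, commands, stack):
    """Run tokens in order; a token naming a command suspends this string."""
    for index, token in enumerate(command.split()):
        try:
            if token in commands:
                # The named command runs to completion, then we resume here.
                execute(commands[token], commands, stack)
            elif token == "+":
                level1 = stack.pop()
                level2 = stack.pop()
                stack.append(level2 + level1)
            else:
                stack.append(float(token))
        except CommandError as err:
            err.trace.append((command, index))  # record the suspended string
            raise
        except (IndexError, ValueError) as err:
            raise CommandError(str(err), [(command, index)]) from err


def run(command, commands):
    """Return (stack, None) on success or (stack, trace) after an error."""
    stack = []
    try:
        execute(command, commands, stack)
        return stack, None
    except CommandError as err:
        return stack, err.trace


commands = {"sum3": "+ +"}
print(run("1 2 3 sum3", commands))  # succeeds
print(run("1 2 sum3", commands))    # second + finds too few arguments
```

In the failing case the trace names the execution point inside sum3 first, followed by the point in the command string that invoked it.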

When a top-level command in the main thread finishes, experix displays a few of the stack levels, showing the data type, string length or array dimensions, the value or an excerpt of that, and some other information for pointers and special objects, including the beginning of the help message. Then it gives the prompt and awaits user input.

Console messages from experix are dispatched by a function which receives the message and a route code. There are route codes for prompt, errors, warnings, stack display, command string display, help and other things. On a text-only display the route may be ignored and the messages simply printed as they are delivered. On a graphics display, the route determines the font, colors and screen region for the message. Many messages include color and reverse-video escape sequences for clarity in the console display. The dispatch function records the messages in a log file, and this can be reviewed later to find out what the user is doing, what the problems and pitfalls are and how to improve commands and make new ones.

The structure of experix

The program is started with a command-line argument that directs it to a screen layout file. It execs a process called svgaserv, which accepts commands through a fifo and manipulates the graphics framebuffer. The screen layout file contains svgaserv commands that define a video display region (VDR) for each route code used by experix's message dispatch function. The VDR contains the screen-relative coordinates, print font, number of print lines, colors, and various mode settings, and VDR-relative coordinates for data plotting. All screen displays are done by writing svgaserv commands, which include the VDR number, to the svgaserv fifo.

A dedicated thread and a synchronization flag are used to get command input via readline calls. This allows experix to run commands from the idle and timer/signal queues while it is receiving command input from the user. The prompt, key echoes and cursor movements are displayed via stdout. In graphics mode, experix directs stdout to a pipe, and starts a thread which reads characters from that pipe and packages them into svgaserv commands. Thus, the command prompt and user input appear where they should, while other things may be going on in the display.

A command string is made of tokens, which are evaluated sequentially except when branches are encountered. A huge and ungainly block of code implements a decision tree designed to minimize the time needed to determine what each token is. This means that the character sequences for operators are organized into trees; for example, stack operators begin with the backslash character and math functions begin with a period. To calculate an exponential, one types .exp, which may take a little time to get used to at first. This is an undisciplined command language and a confusing programming language, but it is easily extensible, and it does not force the user to think too much about syntax. It is deliberately kept extremely concise. I prefer to type a short sequence like \d where another language might use STACKLEV_1_DISCARD. While the latter might be more readable, especially for someone who has not memorized many operators, it is no easier to remember correctly.
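The idea of deciding a token's class from its leading character can be sketched as a table lookup. The prefix assignments below are taken from the examples in this document, but the function classify and the table PREFIX_CLASS are hypothetical, and the sketch ignores cases the real decision tree must handle, such as signed numbers, arithmetic and comparison operators, and operators with a leading count like 3\p.

```python
# First character -> token class, following the examples in the text.
PREFIX_CLASS = {
    "\\": "stack operator",
    ".": "function operator",
    ";": "numerical constant",
    ":": "logical operator",
    ",": "local variable op",
    "[": "array element op",
    "]": "array block op",
    "?": "help request",
    "&": "command file exec",
    "'": "named item reference",
}


def classify(token):
    """Classify a token by looking only at its first one or two characters."""
    first = token[0]
    if first.isdigit() or first == "#":
        return "number"
    # A period followed by a digit is a number (.12e3), otherwise a function.
    if first == "." and len(token) > 1 and token[1].isdigit():
        return "number"
    # Anything without a known prefix is looked up by name elsewhere.
    return PREFIX_CLASS.get(first, "name")


for tok in ["\\d", ".exp", ".12e3", "#x7b", ";pi", "fft"]:
    print(tok, "->", classify(tok))
```

Because the decision is made on the first character or two, its cost does not grow with the number of operators defined.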

These are the major classes of tokens, with examples and what they do:

numbers

   12   .12   +3   -5.0   #34   #xab   #W45

put a number on stack

operators

   \d   ./   .erf   ;N   :O   ]t   !=   $1   ,'e )

perform an operation

functions

   segfunc   fft   usb   thread   exec

run a compiled code item

variables

   pokrDsegs   sthread_mon   globalints

recall named data items

commands

   runpokr   bm2xpand   hist1

run an experix code item

controls

   THR_runpokr   HAN_daq

access files, pipes, threads, devices

Functions, variables, commands and controls are found by searching for the name in the command table, which may be arbitrarily long. Evaluation of a function causes the corresponding code to be executed. Evaluation of a variable causes its value to be pushed onto the stack. Evaluation of a command causes its command string to be submitted to the interpreter. Evaluation of a control puts that control on the stack for use by functions that handle them. If the name of a function, command or variable is prefixed with an apostrophe, a pointer to that item is pushed onto the stack (apostrophe is redundant with controls). Operators, functions and commands can use pointers to variables for arguments, so that they affect named data items. A pointer on the stack or in a command local variable can be used to evaluate the item, to avoid the name lookup time in a loop. If the name of a variable is postfixed with =, the value in stack level 1 is stored in that variable. Functions and commands use the stack and variables in a generally arbitrary way. Describing this is an essential part of documenting each function, command and variable.
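The four outcomes of a name lookup (push a value, call code, submit a command string, push a pointer) and the = postfix can be sketched as follows. Everything here is hypothetical scaffolding: the table layout, the names run and make_table, and the sample entries gain, double and twice. Whether name= removes the value from the stack is not stated above; this sketch assumes it stays.

```python
def run(command, table, stack):
    """Evaluate name tokens against a command table (a plain dict here)."""
    for token in command.split():
        if token.startswith("'"):
            stack.append(("ref", token[1:]))           # pointer to the item
        elif token.endswith("=") and token[:-1] in table:
            table[token[:-1]] = ("variable", stack[-1])  # store level 1
        elif token in table:
            kind, payload = table[token]
            if kind == "variable":
                stack.append(payload)                  # push its value
            elif kind == "function":
                payload(stack)                         # run compiled code
            else:
                run(payload, table, stack)             # submit command string
        else:
            stack.append(float(token))
    return stack


def double(stack):
    """Stand-in for a compiled function: doubles the entry in level 1."""
    stack.append(2 * stack.pop())


def make_table():
    return {
        "gain": ("variable", 1.5),
        "double": ("function", double),
        "twice": ("command", "double double"),
    }


print(run("gain twice", make_table(), []))   # value pushed, then doubled twice
print(run("'gain", make_table(), []))        # a reference, not the value
table = make_table()
run("4 gain=", table, [])
print(table["gain"])                         # the variable now holds 4.0
```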

Functions and operators are overloaded, so that +, .exp, etc. will work on different data types and on arrays as well as single numbers. What they do is described in help files that are accessed by the ? operators. Thus, the token ?+ displays the file about binary operators, and ??+ spawns an edit session on that file. Similarly, ?fft and ??fft give help on the fft function. Variables and commands can be created by the def function, and the definition can include a help string that may direct the help system to a file. For a ?? help request, the editor is started in read-only mode to prevent accidents, but the user is encouraged to improve the help files and correct them where necessary. It is possible to change the path to the help files, so that each user can have a private copy.

The command stack is a two-part data structure. Currently it is allocated at program start and not extensible, and that seems to be adequate. Each stack entry consists of a 32-bit code in part A, and a data item in part B. The codes in part A show the type of data and the amount of space that it occupies in part B. Part A grows upward in memory while part B grows downward, so that a full stack uses the whole allocation independent of what kind of data is in it. Numbers are stored directly in the stack, and arrays and strings are stored by means of a pointer in part B to a structure that contains the length or dimensions and the data. A stack level is located by indexing directly to its code in part A and adding the data lengths of all higher levels in order to find the corresponding data in part B. We avoid using a pointer in part A to the data in part B because these would have to be updated whenever the stack levels are rearranged.
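A toy model of the two-part layout may make the addressing clearer. It is a sketch only: the class TwoPartStack, the two type codes, and the packing of type and length into the 32-bit code (type in the upper 16 bits, byte length in the lower 16) are all assumptions, and "higher levels" is read here as the entries nearer the top of the stack than the one being located.

```python
import struct

TYPE_INT32, TYPE_FLOAT64 = 1, 2
FORMATS = {TYPE_INT32: "<i", TYPE_FLOAT64: "<d"}


class TwoPartStack:
    def __init__(self, size=64):
        self.mem = bytearray(size)  # one fixed allocation shared by both parts
        self.a_top = 0              # next free byte of part A (grows upward)
        self.b_top = size           # lowest used byte of part B (grows downward)

    def push(self, type_code, value):
        fmt = FORMATS[type_code]
        length = struct.calcsize(fmt)
        if self.a_top + 4 > self.b_top - length:
            raise MemoryError("stack full")
        self.b_top -= length
        struct.pack_into(fmt, self.mem, self.b_top, value)
        struct.pack_into("<I", self.mem, self.a_top, (type_code << 16) | length)
        self.a_top += 4

    def _code(self, level):
        """Return (type, length) for a level by direct indexing into part A."""
        (code,) = struct.unpack_from("<I", self.mem, self.a_top - 4 * level)
        return code >> 16, code & 0xFFFF

    def peek(self, level=1):
        """Find the data in part B by summing the lengths of the levels above."""
        offset = self.b_top
        for above in range(1, level):
            offset += self._code(above)[1]
        type_code, _ = self._code(level)
        return struct.unpack_from(FORMATS[type_code], self.mem, offset)[0]

    def pop(self):
        value = self.peek(1)
        _, length = self._code(1)
        self.a_top -= 4
        self.b_top += length
        return value


s = TwoPartStack()
s.push(TYPE_INT32, 7)
s.push(TYPE_FLOAT64, 2.5)
print(s.peek(1), s.peek(2), s.a_top, s.b_top)
```

With 4-byte codes and 4-byte integers, a 64-byte allocation holds exactly eight entries before the two parts meet, whatever mix of part A and part B usage led there.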

Asynchronous, Timed and Concurrent Execution

Experiments often have a need for doing certain things at particular times or in response to external signals. This is accomplished by means of a function that associates a command with a timer or signal. Timed commands run from POSIX interval timers and timer expiration is handled by a dedicated thread which queues the command string for the expired timer. This thread also handles the signals which may have command strings assigned to them. While the interpreter is running a command, and while it is waiting for one, it checks the timer/signal queue and runs whatever it finds there. The user can submit a command string loop that takes a long time to run, and while it is running, timer and signal commands will be done between its tokens. These are done atomically, i.e. experix does not insert another command between tokens of one from the timer/signal queue, and each one uses a private experix stack. Timer and signal commands are analogous to interrupt handlers, except that they work entirely within the command string context. They can put commands in the idle queue, which is consumed when there is no user command to do (analogous to the bottom half of an interrupt handler). Timer and signal commands are suitable for monitoring and servicing parts of an experiment where exact timing (better than what process scheduling delays will allow) is not needed and a few missed or delayed activations can be tolerated (at fast signal rates).
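The interleaving rule (queued commands run between the tokens of a user command, each one whole and on its own stack) can be sketched without real timers. In this illustration the function names are hypothetical, only numbers and + are understood, and a callback named fake_timer stands in for the timer thread so that the result is deterministic.

```python
import queue


def step(token, stack):
    """Execute one token: + adds the top two entries, anything else is a number."""
    if token == "+":
        stack.append(stack.pop() + stack.pop())
    else:
        stack.append(float(token))


def run_queued(pending, trace):
    """Drain the timer/signal queue; each command runs whole, on a private stack."""
    while True:
        try:
            command = pending.get_nowait()
        except queue.Empty:
            return
        private_stack = []
        for token in command.split():
            step(token, private_stack)
            trace.append(("timer", token))


def run_user(command, pending, trace, after_token=None):
    """Run a user command, servicing the queue between its tokens."""
    stack = []
    for token in command.split():
        run_queued(pending, trace)
        step(token, stack)
        trace.append(("user", token))
        if after_token:
            after_token(token)
    run_queued(pending, trace)
    return stack


def demo():
    pending, trace = queue.Queue(), []

    def fake_timer(token):
        # Stands in for the timer thread: "expires" right after the token 2.
        if token == "2":
            pending.put("10 20 +")

    stack = run_user("1 2 +", pending, trace, after_token=fake_timer)
    return stack, trace


print(demo())
```

The trace shows the three timer tokens running back to back between the user's 2 and +, and the user's result is unaffected because the timer command never touched the user's stack.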

The thread function creates an execution thread in which the interpreter runs an experix command string concurrently with the main thread and other activities, using a private stack. There is an associated thread control object that a command in another thread can use for getting status information and ending a wait state. A threaded command can suspend itself in a timed wait or call a function that blocks for a long or unknown time, and this does not suspend the activities of other threads or block the command prompt. Threaded commands are useful for operating other programs via pipes and performing any tasks that block, such as USB data acquisition.
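A threaded command with its control object might look roughly like this in Python. The class ThreadControl, its status strings, and the wait token are invented for the sketch; the point is only that the command runs on a private stack and that another thread can query its status and end its wait state.

```python
import threading


class ThreadControl:
    """Hypothetical control object for one threaded command string."""

    def __init__(self, command):
        self.status = "created"
        self.result = None
        self._wake = threading.Event()
        self._thread = threading.Thread(target=self._run, args=(command,))

    def start(self):
        self._thread.start()
        return self

    def wake(self):
        """Called from another thread to end the command's wait state."""
        self._wake.set()

    def join(self):
        self._thread.join()

    def _run(self, command):
        stack = []  # private to this thread
        self.status = "running"
        for token in command.split():
            if token == "wait":
                self.status = "waiting"
                self._wake.wait(timeout=5.0)  # timed wait; others keep running
                self.status = "running"
            elif token == "+":
                stack.append(stack.pop() + stack.pop())
            else:
                stack.append(float(token))
        self.result = stack
        self.status = "done"


ctl = ThreadControl("1 2 wait +").start()
ctl.wake()   # the main thread stays free to do this, or anything else
ctl.join()
print(ctl.status, ctl.result)
```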

The exec function starts another system process, either by shell command or by exec'ing a program with argument and environment strings. It can accept an experix command string to be put in queue upon receipt of the child process termination signal. This enables synchronizing with child termination, without waiting for that.

Hardware Interfaces

Experix was conceived as a general-purpose environment for data acquisition and device control. The experix core needs to be agnostic as regards the peculiarities of specific daq devices, and yet able to use them effectively. The first devices that it supported were bus cards operated by kernel modules. There is a quite complicated interface for these in the xcd function, which supports device operations in a way that tries to be general and extensible while still remaining standardized. The kernel driver provides memory-mapped pages for device control and data, and a read file operation for two-way transfer of commands, status and other information. The driver's interrupt handler can be instructed to send a “new data” signal for a specified number of new data points, and then the experix signal thread queues a command string which gets new data from the data pages, updates data and graphs, and performs control actions on the device. That way the screen and data get updated almost as fast as the measurement proceeds while the command prompt remains available and other things are being done in the main thread. The driver and the “new data” commands are the only parts of this that should be device-specific.

The most general interface is provided by the file function utilizing pipes to control another program. Since file can read and write both binary and formatted data of any length, such an interface is easily designed for any equipment operating program.

The usb function does control transfers and bulk transfers with data in a string or array stack object. Details such as buffer length and endpoint address are contained in the USB device handle object for the target device, which is given as one of the stack arguments. An application script performs device discovery and creates the handle object, and then operating commands need only provide that object along with the commands or data when they invoke the usb function.

Data acquisition devices usually accept DAC values or give ADC results in integer form, but they differ in word size (1 or 2 bytes or maybe more), format (signed or 1's complement or unsigned) and field use (left- or right-justified). Some have device registers that contain several bits or bit fields for different device actions. It is convenient to compute DAC waveforms in floating-point and to process the ADC output in floating-point for scaling and unit conversion. A general program needs to handle all the data types and formats it might encounter, and experix addresses this problem by providing 8 integer data types, 3 floating-point data types, and operators for inter-converting them. The preparation, examination and archiving of binary data is seamlessly integrated into the experix core. Most arithmetic and math operators will handle any of the data types. There are integer and logical operators to do things with integers that cannot be done correctly by implicitly converting to floating-point during the calculation. All of these operators handle arrays as well as individual numbers.
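To make the format differences concrete, here is a sketch of turning one raw ADC word into volts. The function adc_to_volts and its parameters are hypothetical, the 10 V full scale is an arbitrary example, and "signed" is taken to mean two's complement, which is an assumption. The field extraction is done with integer shifts and masks before any conversion to floating-point.

```python
def adc_to_volts(raw, bits, word_bits=16, signed=True,
                 left_justified=False, full_scale=10.0):
    """Convert one raw ADC word to volts (sketch; two's complement if signed)."""
    if left_justified:
        raw >>= word_bits - bits      # move the field down to bit 0
    raw &= (1 << bits) - 1            # keep only the converter's bits
    if signed and raw & (1 << (bits - 1)):
        raw -= 1 << bits              # sign-extend a negative value
    span = 1 << (bits - 1) if signed else 1 << bits
    return full_scale * raw / span


# 12-bit signed converter, left-justified in a 16-bit word
print(adc_to_volts(0x8000, 12, left_justified=True))   # most negative code
print(adc_to_volts(0x7FF0, 12, left_justified=True))   # most positive code
# 12-bit unsigned converter, right-justified
print(adc_to_volts(0x800, 12, signed=False))           # mid-scale
```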