Date: Tuesday Jul 2 10:18 PDT 1996
From: Richard Rothschild (richardr@mamacass.ucsd.edu)

HEXTE Cluster A experiences commanding upset event

Richard Rothschild, HEXTE Principal Investigator

On 1996 June 28 at approximately 2:26 am UT, the RXTE SOF reported the sudden onset of error messages associated with communicating with HEXTE's A cluster. This coincided with a passage through the South Atlantic Anomaly (SAA), a region of high charged-particle flux. It quickly became apparent that cluster A was not responding to commands. Since the HEXTE had been configured to stop science processing for the SAA passage, no scientific data were received from cluster A. The housekeeping data from cluster A were not affected and showed that all the functions of the cluster were continuing as expected and the detectors were unaffected.

In order to understand the problem, the UCSD HEXTE team and the SOF studied the error messages being generated by the spacecraft's 1773 communications system. A long command load was sent to exercise a different set of command channels to determine if they were out of commission. They were. The first attempt to correct the situation was to send a reset command from the spacecraft to the HEXTE communication chip (the BCRT chip). This did not correct the situation, but led to the conclusion that the BCRT chip was not operating properly and HEXTE was building packets with bad header information, which was leading to the errors seen.

On the assumption that the BCRT control/packet generating program in HEXTE had been corrupted, the SOF was asked to send a "soft" reboot command. This reloaded the entire HEXTE flight software from ROM, and restarted processing without removing power to the system. This process was successful in regaining command control at 15 minutes into June 29 UT. Cluster A was then given commands to update the configuration to the present default values, and by 2:00 am UT on June 29 the HEXTE was back to normal operation and the UCSD HEXTE team could resume writing their Cycle 2 XTE proposals, which were due the following day.

The SOF and the UCSD HEXTE team are continuing to study the event from the telemetry received. The most likely scenario is that the cluster's radiation-hard RAM (UT7156, 32K x 8-bit memory) suffered an "upset", i.e. one or more charged particles from the SAA effected bit-flips in the BCRT control table and/or HEXTE telemetry formatting software. This in turn caused cluster A to reject commands and produce corrupted packets. We are trying to understand the early occurrence of such an event in the mission (after 6 months of operation), since the specifications of the radiation-hard RAM holding the software indicate that thousands of days should elapse between events. In any event, the HEXTE soft reboot command restored the cluster operation, and no degradation of performance has been noted since.
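
To put the "thousands of days between events" specification in perspective, the short Python sketch below estimates how likely at least one upset would be within the first 6 months. It is an illustration only, not the HEXTE team's analysis: it assumes upsets arrive as a Poisson process with a constant rate, takes "6 months" as roughly 180 days, and tries several hypothetical mean intervals, since the report does not give the actual specification value. The function name is likewise made up for this example.

```python
import math


def prob_at_least_one_upset(elapsed_days, mean_days_between_events):
    """P(one or more events in elapsed_days), assuming a Poisson process."""
    return 1.0 - math.exp(-elapsed_days / mean_days_between_events)


# Assumption: "6 months of operation" is taken as roughly 180 days.
ELAPSED_DAYS = 180.0

# Hypothetical mean intervals; the report says only "thousands of days".
for mean_days in (1000.0, 3000.0, 10000.0):
    p = prob_at_least_one_upset(ELAPSED_DAYS, mean_days)
    print(f"mean interval {mean_days:7.0f} d -> P(>=1 upset in 180 d) = {p:.3f}")
```

Under these assumed values the probability ranges from roughly 2 to 16 percent, so an upset this early would be uncommon but not implausible if the true mean interval lies near the low end of "thousands of days".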




Send questions or comments to Philip Blanco
pblanco@ucsd.edu