Source at commit db7ae94e646d1f27b13d2fb766a5803b66b8d5f9 created 8 years 12 days ago.
By Werner Almesberger, m1rc3/norruption/: loop5 test: cut power while in standby
--- Tue 2011-09-06 ------------------------------------------------------------

Running "loop": power-cycle, sleep 2 s, jtag-boot, sleep 70 seconds,
which is enough to boot into FN and render "The Tunnel" for a moment,
then power-cycle again (off-time is 5 s).

Note that the test loop is "open-loop" and will keep cycling past any
problems. The first time a corrupt standby (or any other issue) is
observed may therefore be well after the actual event.
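
Because the loop is open-loop, the time between two manual checks
bounds how late a problem can be noticed. A rough Python sketch of
that arithmetic, using only the timings given above. The duration of
jtag-boot is not stated in this log, so it is a parameter assumed to
be 0 s, which makes the results lower bounds. The function names are
made up for illustration.

```python
# Timings from the "loop" description above, in seconds.
OFF_TIME_S = 5    # power stays off between cycles
SETTLE_S = 2      # sleep between power-on and jtag-boot
RUN_TIME_S = 70   # sleep after jtag-boot


def cycle_seconds(jtag_boot_s=0):
    """Lower bound on one cycle; jtag_boot_s is an assumption (not logged)."""
    return OFF_TIME_S + SETTLE_S + jtag_boot_s + RUN_TIME_S


def check_gap_hours(last_checked_run, next_checked_run, jtag_boot_s=0):
    """Hours that pass between two manual checks at the given run numbers."""
    runs = next_checked_run - last_checked_run
    return runs * cycle_seconds(jtag_boot_s) / 3600


print(cycle_seconds())                       # 77
print(round(check_gap_hours(500, 684), 1))   # 3.9
```

With checks at runs 500 and 684, an event right after the first check
would go unnoticed for close to four hours.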

1: started around 11:53 (M1 configuration is original, without locking)
(around 500) visually checked boot process; standby was reached normally

--- Wed 2011-09-07 ------------------------------------------------------------

645: neocon stopped working (around 01:58)
666: detected neocon failure at run 666: restarted neocon; urjtag failed
     this cycle; back to normal at 667
684: checked LEDs again (first time since ~500) and found that standby
     may be failing. stopping test at 685 (around 02:50) for
     investigation.

Downloaded the standby bitstream:

  wget https://raw.github.com/milkymist/scripts/master/scripts/reflash_m1.sh
  chmod 755 reflash_m1.sh

  ./reflash_m1.sh --read-flash

Found two corruptions in the standby bitstream:

  diff -u <(hexdump -C standby.fpg) <(hexdump -C /home/root/.qi/milkymist/read-flash/2011...)

-00000080 00 00 4c 83 00 00 4c 87 00 00 cc 85 d8 47 cc 43 |..L...L......G.C|
+00000080 00 00 4c 83 00 00 4c 87 00 00 c4 80 d8 47 cc 43 |..L...L......G.C|

-00002840 00 08 cc 26 00 00 00 00 00 00 00 00 0c 44 00 98 |...&.........D..|
+00002840 00 00 cc 26 00 00 00 00 00 00 00 00 0c 44 00 98 |...&.........D..|
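
To see which bits changed, the diff pairs above can be compared byte
by byte. The following Python sketch is an illustration only and not
part of the original test tooling; the helper names are hypothetical.
For each differing byte it reports which bits went from 1 to 0 and
which went from 0 to 1.

```python
def parse_hexdump_line(line):
    """Return (offset, data) for one `hexdump -C` line.

    A leading diff marker (+ or -) and the trailing |ascii| column
    are ignored.
    """
    fields = line.lstrip("+-").split("|")[0].split()
    return int(fields[0], 16), bytes(int(b, 16) for b in fields[1:])


def bit_changes(old_line, new_line):
    """List (address, old, new, cleared_bits, set_bits) per differing byte."""
    base, old = parse_hexdump_line(old_line)
    _, new = parse_hexdump_line(new_line)
    return [
        (base + i, o, n, o & ~n & 0xFF, n & ~o & 0xFF)
        for i, (o, n) in enumerate(zip(old, new))
        if o != n
    ]


good = "-00000080 00 00 4c 83 00 00 4c 87 00 00 cc 85 d8 47 cc 43 |..L...L......G.C|"
bad = "+00000080 00 00 4c 83 00 00 4c 87 00 00 c4 80 d8 47 cc 43 |..L...L......G.C|"

for addr, o, n, cleared, set_bits in bit_changes(good, bad):
    print(f"0x{addr:06x}: {o:02x} -> {n:02x} "
          f"cleared={cleared:08b} set={set_bits:08b}")
```

In both pairs above every change is a 1 -> 0 transition (cc -> c4,
85 -> 80, 08 -> 00). That is the direction in which a NOR program
operation changes bits, but this is only an observation; the log
itself does not draw that conclusion.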

CRC-checked the partitions:

  git clone git://github.com/milkymist/milkymist
  cd milkymist/tools/
  gcc -Wall -I. -o flterm flterm.c
  wget http://milkymist.org/updates/current/for-rc3/boot.4e53273.bin
  ./flterm --port /dev/ttyUSB0 --kernel boot.4e53273.bin

  serialboot
  a

  only standby.fpg failed the CRC check

Reflashed the standby bitstream:

  wget http://milkymist.org/updates/2011-07-13/for-rc3/fjmem.bit
  (or http://milkymist.org/updates/fjmem.bit.bz2)
  wget http://milkymist.org/updates/current/standby.fpg

  jtag

  cable milkymist
  detect
  instruction CFG_OUT 000100 BYPASS
  instruction CFG_IN 000101 BYPASS
  pld load fjmem.bit
  initbus fjmem opcode=000010
  frequency 6000000
  detectflash 0
  endian big
  flashmem 0 standby.fpg noverify

M1 enters standby normally again.

Running "loop2": power-cycle, sleep 2 s, jtag-boot, sleep 10 seconds,
which is enough to begin (but not finish) booting RTEMS, then
power-cycle again (off-time is 5 s).

1: started around 05:01. Observed until about 200-300 (06:00-06:30)
that standby was okay.
~730 (08:48): observed that standby didn't load anymore (note: due to
a bug in labsw, power is not turned on in about 5-10% of the cycles,
so the real cycle count should be around 650-700.)
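
The 650-700 estimate follows from discounting the cycles in which
labsw did not turn the power on. A minimal Python sketch of that
correction, using the counted 730 cycles and the 5-10% miss rate from
the note above (the function name is made up for illustration):

```python
def powered_cycles(counted, miss_percent):
    """Cycles in which power actually came on (integer arithmetic)."""
    return counted * (100 - miss_percent) // 100


# A 10% miss rate gives the low end, 5% the high end.
print(powered_cycles(730, 10), powered_cycles(730, 5))   # 657 693
```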

Standby bitstream difference:

-00000080 00 00 4c 83 00 00 4c 87 00 00 cc 85 d8 47 cc 43 |..L...L......G.C|
+00000080 00 00 00 00 00 00 4c 87 00 00 cc 85 d8 47 cc 43 |......L......G.C|

Reflashed standby and locked the NOR. Testing with loop2 again.

1 (09:18): started
... continuing through the night ...

--- Thu 2011-09-08 ------------------------------------------------------------

3483 (03:18): standby is good so far
4325 (07:40): manually ended test. Standby is still good, but starting
    with cycle 3704, booting RTEMS failed with

    I: Booting from flash...
    I: Loading 1889692 bytes from flash...
    E: CRC failed (expected aa12a56a, got 68ec25e6)

A CRC check yielded:

Images CRC:
  Checking : standby.fpg CRC passed (got c58e8905)
  Checking : soc-rescue.fpg CRC passed (got 30dcc535)
  Checking : bios-rescue.bin(CRC) CRC passed (got c78353fa)
  Checking : splash-rescue.raw CRC passed (got e8ff824f)
  Checking : flickernoise.fbi(rescue)(CRC) CRC passed (got aa12a56a)
  Checking : soc.fpg CRC passed (got 3a31e737)
  Checking : bios.bin(CRC) CRC passed (got 86e23684)
  Checking : splash.raw CRC passed (got 978f860c)
  Checking : flickernoise.fbi(CRC) CRC failed (expected aa12a56a, got 68ec25e6)

Read back the FlickerNoise partition with

  readmem 0x920000 0x0400000 fn.bin

Compare with the original:

  wget http://www.milkymist.org/updates/2011-07-13/flickernoise.fbi
  md5sum flickernoise.fbi
  5b7367e71bda306b080bde124615859b flickernoise.fbi

  diff -u <(hexdump -C flickernoise.fbi) <(hexdump -C fn.bin)

...
-0008a380 28 43 00 00 34 64 00 01 58 44 00 00 5c 60 00 1e |(C..4d..XD..\`..|
+0008a380 28 43 00 00 00 00 00 01 58 44 00 00 5c 60 00 1e |(C......XD..\`..|
...

Recovered the FN partition and unlocked the NOR:

  flashmem 0x920000 flickernoise.fbi noverify
  unlockflash 0 55

New test series with script loop4. This differs from loop2 in that
it uses "pld reconfigure" to return to standby, instead of
power-cycling. If we still observe corruption with this test, then
a software problem would be to blame.

1 (09:11): started
2509 (19:33): standby looks good

All CRC checks pass. Verified that NOR was unlocked:

  (load fjmem, etc.)
  peek 0 # show old value
  poke 0 0x40 0 0x0000 # Word Program
  peek 0 # read back status (0x80 if okay, 0x92 if locked)
  poke 0 0xff # Read Array (switch back to normal operation)
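
The two status values can be decoded bit by bit. The Python sketch
below assumes the status register layout of Intel-style NOR flash
(CFI command set 0001), which matches the 0x40 Word Program and 0xff
Read Array commands used above. The log does not name the flash part,
so the bit names are an assumption and should be checked against the
datasheet of the actual device.

```python
# Assumed status register layout (Intel-style command set); bit 0 is
# reserved and therefore not listed.
STATUS_BITS = {
    7: "ready",
    6: "erase suspended",
    5: "erase / clear-lock-bits error",
    4: "program / set-lock-bit error",
    3: "programming voltage low",
    2: "program suspended",
    1: "block locked",
}


def decode_status(value):
    """Return the names of all status bits that are set in `value`."""
    return [name for bit, name in STATUS_BITS.items() if value & (1 << bit)]


for status in (0x80, 0x92):
    print(f"0x{status:02x}: {', '.join(decode_status(status))}")
```

Under that assumption, 0x80 is just "ready", while 0x92 is "ready"
plus a program error plus the block-locked flag, which agrees with the
"0x92 if locked" comment above.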

--- Fri 2011-09-09 ------------------------------------------------------------

New test with script "loop5". This time, we only power cycle but don't
try to boot out of standby. The purpose of this test is to confirm that
NOR corruption does not occur when powering down while in standby.

1 (11:04): started
