Werner's Miscellanea
Source at commit 2e3553d458a1c1050c47e5c014113e196963e6c5 created 12 years 6 months ago. By Werner Almesberger, m1rc3/norruption/LOG: NOR corruption not observed without power-cycling

--- Tue 2011-09-06 ------------------------------------------------------------

Running "loop": power-cycle, sleep 2 s, jtag-boot, sleep 70 seconds,
which is enough to boot into FN and render "The Tunnel" for a moment,
then power-cycle again (off-time is 5 s).

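The cycle described above can be sketched as follows. This is only an illustration of the open-loop structure, not the actual "loop" script: `power_on`, `power_off` and `jtag_boot` are hypothetical callables standing in for the labsw and jtag-boot invocations, which this log does not show.

```python
import time
from typing import Callable


def run_loop(cycles: int,
             power_on: Callable[[], None],
             power_off: Callable[[], None],
             jtag_boot: Callable[[], None],
             off_time_s: float = 5.0,
             settle_s: float = 2.0,
             run_time_s: float = 70.0,
             sleep: Callable[[float], None] = time.sleep) -> None:
    """Open-loop power-cycle test: nothing is checked, so it never stops early."""
    for _ in range(cycles):
        power_off()
        sleep(off_time_s)   # off-time of the power cycle
        power_on()
        sleep(settle_s)     # short pause before booting over JTAG
        jtag_boot()
        sleep(run_time_s)   # let the system run before the next cycle


if __name__ == "__main__":
    # Dry run: record the steps instead of touching hardware or waiting.
    steps = []
    run_loop(1,
             power_on=lambda: steps.append("on"),
             power_off=lambda: steps.append("off"),
             jtag_boot=lambda: steps.append("jtag-boot"),
             sleep=lambda s: steps.append(f"sleep {s:g} s"))
    print(steps)
```

Because the loop never looks at the result of a cycle, a failure only shows up when someone inspects the board or the console.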
Note that the test loop is "open-loop" and keeps cycling past any
problems. The first time a corrupt standby (or any other issue) is
observed may therefore be well after the actual event.

1: started around 11:53 (M1 configuration is original, without locking)
(around 500) visually checked boot process; standby was reached normally

--- Wed 2011-09-07 ------------------------------------------------------------

645: neocon stopped working (around 01:58)
666: detected neocon failure at run 666: restarted neocon; urjtag failed
     this cycle; back to normal at 667
684: checked LEDs again (first time since ~500) and found that standby
     may be failing. Stopping test at 685 (around 02:50) for
     investigation.

Downloaded the standby bitstream:

    wget https://raw.github.com/milkymist/scripts/master/scripts/reflash_m1.sh
    chmod 755 reflash_m1.sh

    ./reflash_m1.sh --read-flash

Found two corruptions in the standby bitstream:

    diff -u <(hexdump -C standby.fpg) <(hexdump -C /home/root/.qi/milkymist/read-flash/2011...)

    -00000080 00 00 4c 83 00 00 4c 87 00 00 cc 85 d8 47 cc 43 |..L...L......G.C|
    +00000080 00 00 4c 83 00 00 4c 87 00 00 c4 80 d8 47 cc 43 |..L...L......G.C|

    -00002840 00 08 cc 26 00 00 00 00 00 00 00 00 0c 44 00 98 |...&.........D..|
    +00002840 00 00 cc 26 00 00 00 00 00 00 00 00 0c 44 00 98 |...&.........D..|

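To see which bits changed in a pair of dump lines like the ones above, a small helper can compare them byte by byte. This is a sketch, not part of the original procedure; the function names are made up for illustration. For both corruptions shown here it reports only cleared bits (1 to 0), which is the direction in which NOR programming changes bits.

```python
from typing import Iterator, Tuple


def parse_hexdump_line(line: str) -> Tuple[int, bytes]:
    """Parse one `hexdump -C` line, with or without a leading diff marker."""
    body = line.strip().lstrip("+-").split("|", 1)[0]  # drop marker and ASCII column
    fields = body.split()
    return int(fields[0], 16), bytes(int(f, 16) for f in fields[1:])


def changed_bits(old_line: str, new_line: str) -> Iterator[Tuple[int, int, int, int, int]]:
    """Yield (address, old, new, bits_cleared, bits_set) for each differing byte."""
    offset, old = parse_hexdump_line(old_line)
    _, new = parse_hexdump_line(new_line)
    for i, (a, b) in enumerate(zip(old, new)):
        if a != b:
            yield offset + i, a, b, a & ~b & 0xFF, ~a & b & 0xFF


GOOD = "-00000080 00 00 4c 83 00 00 4c 87 00 00 cc 85 d8 47 cc 43 |..L...L......G.C|"
BAD = "+00000080 00 00 4c 83 00 00 4c 87 00 00 c4 80 d8 47 cc 43 |..L...L......G.C|"

for addr, old, new, cleared, was_set in changed_bits(GOOD, BAD):
    print(f"0x{addr:08x}: {old:02x} -> {new:02x} "
          f"(cleared 0x{cleared:02x}, set 0x{was_set:02x})")
```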
CRC-checked the partitions:

    git clone git://github.com/milkymist/milkymist
    cd milkymist/tools/
    gcc -Wall -I. -o flterm flterm.c
    wget http://milkymist.org/updates/current/for-rc3/boot.4e53273.bin
    ./flterm --port /dev/ttyUSB0 --kernel boot.4e53273.bin

    serialboot
    a

Only standby.fpg failed the CRC check.

Reflashed the standby bitstream:

    wget http://milkymist.org/updates/2011-07-13/for-rc3/fjmem.bit
    (or http://milkymist.org/updates/fjmem.bit.bz2)
    wget http://milkymist.org/updates/current/standby.fpg

    jtag

    cable milkymist
    detect
    instruction CFG_OUT 000100 BYPASS
    instruction CFG_IN 000101 BYPASS
    pld load fjmem.bit
    initbus fjmem opcode=000010
    frequency 6000000
    detectflash 0
    endian big
    flashmem 0 standby.fpg noverify

M1 enters standby normally again.

Running "loop2": power-cycle, sleep 2 s, jtag-boot, sleep 10 seconds,
which is enough to begin (but not finish) booting RTEMS, then
power-cycle again (off-time is 5 s).

1: started around 05:01. Observed until about 200-300 (06:00-06:30)
   that standby was okay.
~730 (08:48): observed that standby no longer loaded (note: due to
   a bug in labsw, power is not turned on in about 5-10% of the cycles,
   so the real cycle count should be around 650-700.)

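The corrected cycle count in the note above follows from discounting the missed power-ons. A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def effective_cycles(counted: int, miss_low_pct: int, miss_high_pct: int):
    """Return (min, max) cycles that really powered up, given a missed-power-on range in percent."""
    fewest = counted * (100 - miss_high_pct) / 100
    most = counted * (100 - miss_low_pct) / 100
    return fewest, most


low, high = effective_cycles(730, 5, 10)
print(low, high)  # 657.0 693.5
```

This matches the "around 650-700" estimate.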
Standby bitstream difference:

    -00000080 00 00 4c 83 00 00 4c 87 00 00 cc 85 d8 47 cc 43 |..L...L......G.C|
    +00000080 00 00 00 00 00 00 4c 87 00 00 cc 85 d8 47 cc 43 |......L......G.C|

Reflashed standby and locked the NOR. Testing with loop2 again.

1 (09:18): started
... continuing through the night ...

--- Thu 2011-09-08 ------------------------------------------------------------

3483 (03:18): standby is good so far
4325 (07:40): manually ended test. Standby is still good, but starting
              with cycle 3704, booting RTEMS failed with

    I: Booting from flash...
    I: Loading 1889692 bytes from flash...
    E: CRC failed (expected aa12a56a, got 68ec25e6)

A CRC check yielded:

    Images CRC:
    Checking : standby.fpg CRC passed (got c58e8905)
    Checking : soc-rescue.fpg CRC passed (got 30dcc535)
    Checking : bios-rescue.bin(CRC) CRC passed (got c78353fa)
    Checking : splash-rescue.raw CRC passed (got e8ff824f)
    Checking : flickernoise.fbi(rescue)(CRC) CRC passed (got aa12a56a)
    Checking : soc.fpg CRC passed (got 3a31e737)
    Checking : bios.bin(CRC) CRC passed (got 86e23684)
    Checking : splash.raw CRC passed (got 978f860c)
    Checking : flickernoise.fbi(CRC) CRC failed (expected aa12a56a, got 68ec25e6)

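When the CRC report is captured from the serial console during a long run, the failing images can be picked out automatically. The following is a sketch that assumes the report lines look exactly like the ones above; the regular expression and the function name are not part of any Milkymist tool.

```python
import re

# Matches e.g. "Checking : soc.fpg CRC passed (got 3a31e737)" and the
# "CRC failed (expected ..., got ...)" variant.
CRC_LINE = re.compile(
    r"Checking\s*:\s*(?P<image>\S+)\s+CRC\s+(?P<result>passed|failed)\s+"
    r"\((?:expected\s+(?P<expected>[0-9a-f]{8}),\s*)?got\s+(?P<got>[0-9a-f]{8})\)"
)


def failed_images(report: str):
    """Return (image, expected, got) for every image whose CRC check failed."""
    return [
        (m.group("image"), m.group("expected"), m.group("got"))
        for m in CRC_LINE.finditer(report)
        if m.group("result") == "failed"
    ]


REPORT = """\
Checking : standby.fpg CRC passed (got c58e8905)
Checking : flickernoise.fbi(rescue)(CRC) CRC passed (got aa12a56a)
Checking : flickernoise.fbi(CRC) CRC failed (expected aa12a56a, got 68ec25e6)
"""

print(failed_images(REPORT))
```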
Read back the FlickerNoise partition with

    readmem 0x920000 0x0400000 fn.bin

Compare with the original:

    wget http://www.milkymist.org/updates/2011-07-13/flickernoise.fbi
    md5sum flickernoise.fbi
    5b7367e71bda306b080bde124615859b flickernoise.fbi

    diff -u <(hexdump -C flickernoise.fbi) <(hexdump -C fn.bin)

    ...
    -0008a380 28 43 00 00 34 64 00 01 58 44 00 00 5c 60 00 1e |(C..4d..XD..\`..|
    +0008a380 28 43 00 00 00 00 00 01 58 44 00 00 5c 60 00 1e |(C......XD..\`..|
    ...

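Since readmem dumps the whole 0x400000-byte range, fn.bin is longer than the image, so a plain hexdump diff also reports the tail. A sketch that compares only the bytes covered by the original and lists differing words follows. The 16-bit word size is an assumption based on the "Word Program" command used elsewhere in this log, and the function name is made up for illustration.

```python
def diff_words(original: bytes, readback: bytes, word_size: int = 2):
    """Return (offset, original_hex, readback_hex) for each differing aligned word.

    Only the first len(original) bytes of the readback are compared.
    """
    if len(readback) < len(original):
        raise ValueError("readback is shorter than the original image")
    diffs = []
    for offset in range(0, len(original), word_size):
        a = original[offset:offset + word_size]
        b = readback[offset:offset + word_size]
        if a != b:
            diffs.append((offset, a.hex(), b.hex()))
    return diffs


# Synthetic stand-in for the two files, using bytes from the dump above.
original = bytes.fromhex("28430000346400015844")
readback = bytes.fromhex("28430000000000015844") + b"\xff" * 6  # trailing flash content

print(diff_words(original, readback))
```

With the real files, `original` and `readback` would be the contents of flickernoise.fbi and fn.bin read in binary mode.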
Recovered the FN partition and unlocked the NOR:

    flashmem 0x920000 flickernoise.fbi noverify
    unlockflash 0 55

New test series with script loop4. This differs from loop2 in that
it uses "pld reconfigure" to return to standby, instead of
power-cycling. If we still observe corruption with this test, then
a software problem would be to blame.

1 (09:11): started
2509 (19:33): standby looks good

All CRC checks pass. Verified that NOR was unlocked:

    (load fjmem, etc.)
    peek 0                  # show old value
    poke 0 0x40 0 0x0000    # Word Program
    peek 0                  # read back status (0x80 if okay, 0x92 if locked)
    poke 0 0xff             # Read Array (switch back to normal operation)

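The two status values mentioned above can be decoded bit by bit. The bit names below are an assumption: they follow the usual status register layout of Intel-style command-set NOR flash and should be checked against the datasheet of the actual part. Only the values 0x80 and 0x92 come from this log.

```python
# Assumed status register layout (Intel-style command set); verify with the datasheet.
STATUS_BITS = {
    7: "ready",
    6: "erase suspended",
    5: "erase / clear-lock-bits error",
    4: "program / set-lock-bit error",
    3: "programming voltage error",
    2: "program suspended",
    1: "block locked",
}


def decode_status(status: int):
    """Return the names of all status bits that are set, highest bit first."""
    return [name
            for bit, name in sorted(STATUS_BITS.items(), reverse=True)
            if status & (1 << bit)]


print(hex(0x80), decode_status(0x80))
print(hex(0x92), decode_status(0x92))
```

Under this assumption, 0x92 reads as "ready, program error, block locked", which fits the "0x92 if locked" comment.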