
Can Bus comms issue


pfm


I have Can Bus up and running (sort of). I've installed all software and updated printer.cfg. When I start my machine, things look promising. I get a klipper blue screen with the message: "Printer is not ready. The klippy host software is attempting to connect".  But then Klipper shuts down and reports either "Lost communication with MCU 'Can0'." or "mcu 'Can0': Unable to connect".  

I thought maybe I had Can Hi and Can Lo reversed or maybe a bad cable (or bad crimp job by yours truly).  But when I measured the voltage relative to ground at my U2C, I get 2.6 ish for Can Hi (that's fine), but only 1.35 for Can Lo.  Is this to be expected? I've read that Can Lo should be between 1.5 and 2.5.  So I'm thinking I must have a hardware issue.  Has anyone else encountered this?  
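A rough way to sanity-check multimeter readings like these is sketched below. It assumes the nominal high-speed CAN levels (both lines near 2.5 V when the bus is recessive, CAN Hi near 3.5 V and CAN Lo near 1.5 V when dominant) and that a multimeter shows roughly the time-average of each line. The function names and the `dominant_fraction` parameter are made up for illustration.

```python
# Nominal high-speed CAN levels in volts (assumed typical values).
RECESSIVE_V = 2.5
DOMINANT_HI_V = 3.5
DOMINANT_LO_V = 1.5


def expected_average(dominant_fraction):
    """Return the (CAN Hi, CAN Lo) averages a multimeter would roughly show.

    dominant_fraction is the share of time the bus spends in the dominant
    state (0.0 = idle bus, 1.0 = permanently dominant). Hypothetical helper.
    """
    hi = RECESSIVE_V + (DOMINANT_HI_V - RECESSIVE_V) * dominant_fraction
    lo = RECESSIVE_V - (RECESSIVE_V - DOMINANT_LO_V) * dominant_fraction
    return hi, lo


def implied_fraction_from_hi(measured_hi):
    """Estimate the dominant fraction from a CAN Hi reading alone."""
    return (measured_hi - RECESSIVE_V) / (DOMINANT_HI_V - RECESSIVE_V)


if __name__ == "__main__":
    frac = implied_fraction_from_hi(2.6)  # CAN Hi reading from the post
    hi, lo = expected_average(frac)
    print(f"dominant fraction ~{frac:.2f}, expected CAN Lo ~{lo:.2f} V")
```

Under these assumptions the two lines should sit roughly symmetrically around 2.5 V, so a CAN Hi of about 2.6 V would normally pair with a CAN Lo of about 2.4 V. A reading of 1.35 V is well outside that, which fits the suspicion of a hardware problem.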

 

 


On 8/21/2023 at 5:37 AM, pfm said:

I have Can Bus up and running (sort of). I've installed all software and updated printer.cfg. When I start my machine, things look promising. I get a klipper blue screen with the message: "Printer is not ready. The klippy host software is attempting to connect".  But then Klipper shuts down and reports either "Lost communication with MCU 'Can0'." or "mcu 'Can0': Unable to connect".  

Is your can0 interface up and running?

When you create the can0 interface file with

sudo nano /etc/network/interfaces.d/can0

it should have the following content:

allow-hotplug can0
iface can0 can static
  bitrate 1000000
  up ifconfig $IFACE txqueuelen 1024
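If you want to double-check the file rather than eyeball it, here is a minimal sketch that pulls the interface name, bitrate and txqueuelen out of a stanza like the one above. The bitrate is the value to compare against the CAN bus speed chosen when the Klipper and Katapult firmware were built. `parse_can_stanza` is a hypothetical helper, not part of any Klipper tooling.

```python
import re

# The stanza from the post, used here as sample input.
SAMPLE = """\
allow-hotplug can0
iface can0 can static
  bitrate 1000000
  up ifconfig $IFACE txqueuelen 1024
"""


def parse_can_stanza(text):
    """Extract iface name, bitrate and txqueuelen from an ifupdown stanza.

    Returns a dict; values that cannot be found are None.
    """
    iface = re.search(r"^\s*iface\s+(\S+)\s+can\b", text, re.MULTILINE)
    bitrate = re.search(r"^\s*bitrate\s+(\d+)", text, re.MULTILINE)
    qlen = re.search(r"txqueuelen\s+(\d+)", text)
    return {
        "iface": iface.group(1) if iface else None,
        "bitrate": int(bitrate.group(1)) if bitrate else None,
        "txqueuelen": int(qlen.group(1)) if qlen else None,
    }


if __name__ == "__main__":
    print(parse_can_stanza(SAMPLE))
```

To check the real file, read `/etc/network/interfaces.d/can0` and pass its contents to the function instead of `SAMPLE`.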

If those are correct, then ssh into the pi and issue the following commands:

ip a

this should return the following if the can network interface is up and running:

[Screenshot: PastedGraphic17.png]

Check if the communication is healthy - lost packets, etc by issuing:

ip -details -statistics link show can0

This should return the following:

[Screenshot: PastedGraphic18.png]

The above is an example from a new install. On your machine you should see the RX: bytes and TX: bytes values, and whether there are any errors.

Can you post the [mcu] section of the printer.cfg file, please?

Another consideration is whether you recently updated Klipper. If you did, then the CAN bus board may need to be refreshed with Katapult (previously known as CanBoot) in order for the board to be flashed with the latest version of Klipper. If it is an RP2040 CAN bus board, my understanding is that it needs to be reflashed with this newer version.

[Screenshot: image.png]

I cannot comment on whether this is true for STM32 boards.

I did have to reflash my Mellow Fly SHT36v2 with Katapult in order for it to accept the latest Klipper update.

Hope this helps

 


Thanks @mvdveer.  ip a yields:

3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP group default qlen 102
    link/can

ip -details -statistics link show can0 yields:

3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 102
    link/can  promiscuity 0 minmtu 0 maxmtu 0
    can state BUS-OFF restart-ms 0
          bitrate 1000000 sample-point 0.750
          tq 62 prop-seg 5 phase-seg1 6 phase-seg2 4 sjw 1
          gs_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..1024 brp-inc 1
          clock 64000000
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          0          0          0          1          1          1         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped missed  mcast
    8491       1125     0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    1503       275      0       0       0       0

My MCU section looks like this:

[mcu]
serial: /dev/serial/by-id/usb-Klipper_stm32f446xx_140046000150335331383520-if00

[mcu Can0]
canbus_uuid: ee98125fec15

 


23 minutes ago, pfm said:

Thanks @mvdveer.  ip a yields:

3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP group default qlen 102
    link/can

ip -details -statistics link show can0 yields:

3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 102
    link/can  promiscuity 0 minmtu 0 maxmtu 0
    can state BUS-OFF restart-ms 0
          bitrate 1000000 sample-point 0.750
          tq 62 prop-seg 5 phase-seg1 6 phase-seg2 4 sjw 1
          gs_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..1024 brp-inc 1
          clock 64000000
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          0          0          0          1          1          1         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped missed  mcast
    8491       1125     0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    1503       275      0       0       0       0

My MCU section looks like this:

[mcu]
serial: /dev/serial/by-id/usb-Klipper_stm32f446xx_140046000150335331383520-if00

[mcu Can0]
canbus_uuid: ee98125fec15

Communication looks good. The only other thing I can think of is loose wiring/crimping, as you suspected.
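One detail in the pasted output is worth a second look: the line `can state BUS-OFF restart-ms 0` together with non-zero error-warn, error-pass and bus-off counters. On Linux a healthy CAN interface normally reports `ERROR-ACTIVE`. BUS-OFF means the controller saw enough errors to take itself off the bus, and with `restart-ms 0` it stays off until the interface is restarted. The sketch below parses text in the format shown in this thread and flags that condition. The helper names are hypothetical and the sample is trimmed from the post.

```python
import re

# Trimmed copy of the output posted above.
SAMPLE = """\
3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 102
    link/can  promiscuity 0 minmtu 0 maxmtu 0
    can state BUS-OFF restart-ms 0
          bitrate 1000000 sample-point 0.750
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          0          0          0          1          1          1         numtxqueues 1 numrxqueues 1
"""

COUNTERS = ["re-started", "bus-errors", "arbit-lost",
            "error-warn", "error-pass", "bus-off"]


def summarize_can_link(text):
    """Pull the CAN state and the six error counters out of the ip output."""
    state = re.search(r"can state (\S+)", text)
    # The counter values sit on the line right after the counter names.
    row = re.search(r"re-started\s+bus-errors.*\n\s*((?:\d+\s+){5}\d+)", text)
    values = [int(v) for v in row.group(1).split()] if row else []
    return {
        "state": state.group(1) if state else None,
        "counters": dict(zip(COUNTERS, values)),
    }


def looks_healthy(info):
    """True only for an ERROR-ACTIVE interface that never went bus-off."""
    return (info["state"] == "ERROR-ACTIVE"
            and info["counters"].get("bus-off", 0) == 0)


if __name__ == "__main__":
    info = summarize_can_link(SAMPLE)
    print(info["state"], info["counters"], "healthy:", looks_healthy(info))
```

Zero RX/TX errors alone do not show the bus is fine, so checking the state line as well is a reasonable habit.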


I've done a little bit more digging, and it appears to be my serial connection that is the problem, not my CAN bus. On boot, Klipper is looking for

/dev/serial/by-id/usb-Klipper_stm32f446xx_140046000150335331383520-if00

But the only thing in my /dev/serial folder is the 'by-path' folder.

paul@Trident:~ $ cd /dev/serial
paul@Trident:/dev/serial $ ls
by-path

I don't see a 'by-id' folder. I'm not sure when this folder gets created. Should I re-flash the Octopus?


Run the following command and see if it shows the serial ID of your board. Make sure this is the one you have in the printer.cfg file.

 

ls -al /dev/serial/by-id

It should return:

 

/dev/serial/by-id/usb-Klipper_stm32f446xx_140046000150335331383520-if00

 

If you get an error stating that the command cannot be found (or something like it), then follow the instructions here to see if this is the issue.
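The same check can be scripted. Here is a minimal sketch, assuming the standard udev location `/dev/serial/by-id`; the function names are made up for illustration. It lists the entries and reports whether the `serial:` value from printer.cfg is among them.

```python
from pathlib import Path


def find_serial_ids(by_id_dir="/dev/serial/by-id"):
    """Return {link name: resolved device path} for each entry.

    Returns an empty dict if the directory does not exist, which is
    typically the case when udev has not created any by-id links.
    """
    root = Path(by_id_dir)
    if not root.is_dir():
        return {}
    return {p.name: str(p.resolve()) for p in sorted(root.iterdir())}


def serial_is_present(configured_path, by_id_dir="/dev/serial/by-id"):
    """True if the serial: value from printer.cfg names an existing entry."""
    return Path(configured_path).name in find_serial_ids(by_id_dir)


if __name__ == "__main__":
    wanted = ("/dev/serial/by-id/"
              "usb-Klipper_stm32f446xx_140046000150335331383520-if00")
    print(find_serial_ids())
    print("configured serial present:", serial_is_present(wanted))
```

An empty result with the board plugged in points at the host side (udev or the USB connection) rather than at printer.cfg.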

 


Just looking at your original post again. If the error complained about 

On 8/21/2023 at 5:37 AM, pfm said:

Lost communication with MCU 'Can0'." or "mcu 'Can0': Unable to connect".  

it does seem that it is the CAN network interface. Is your CAN bus board connected to your Octopus board (USB to CAN) or to a CAN interface such as a Pi HAT, UTOC, etc.?

Having a serial interface in the [mcu] section would indicate that you are using an external bridge.

It may be worth checking the connections between the Raspberry Pi and the CAN bridge, or try changing the USB cable from the Pi to the bridge.


I spent the past few days rebuilding my entire software environment. I'm following the BTT Build Guide. I reinstalled Mainsail on my Pi, and I'm using a U2C to connect to the USB port on the Octopus. I've updated the U2C firmware. I updated everything suggested by Moonraker. I flashed CanBoot. 'ip a' gives me:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether d8:3a:dd:00:4d:d2 brd ff:ff:ff:ff:ff:ff
3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP group default qlen 1024
    link/can
4: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether d8:3a:dd:00:4d:d4 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.254/24 brd 192.168.2.255 scope global dynamic noprefixroute wlan0
       valid_lft 85358sec preferred_lft 74558sec
    inet6 fdfc:9f30:4389:4957:d49d:f19d:dd66:b0b4/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 1650sec preferred_lft 1650sec
    inet6 fe80::4418:b538:935a:f3af/64 scope link
       valid_lft forever preferred_lft forever

'ip -details -statistics link show can0' gives:

3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1024
    link/can  promiscuity 0 minmtu 0 maxmtu 0
    can state BUS-OFF restart-ms 0
          bitrate 1000000 sample-point 0.750
          tq 62 prop-seg 5 phase-seg1 6 phase-seg2 4 sjw 1
          gs_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..1024 brp-inc 1
          clock 64000000
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          0          0          0          1          1          1         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped missed  mcast
    8800       1166     0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    1528       280      0       0       0       0

But, after I compile the firmware and run 'python3 flash_can.py -i can0 -q' I get:

paul@trident:~/CanBoot/scripts $ python3 flash_can.py -i can0 -q
Resetting all bootloader node IDs...
Checking for Katapult nodes...
Query Complete

It does not give me a UUID. At this point I'm stumped.


22 hours ago, mvdveer said:

Run the following command and see if it shows your serial of the board - make sure this is the one you have in the printer.cfg file

ls -al /dev/serial/by-id

It should return:

/dev/serial/by-id/usb-Klipper_stm32f446xx_140046000150335331383520-if00

If you get an error stating that the command cannot be found (or something like it), then follow the instructions here to see if this is the issue.

With my latest software installation, I do see the serial device. 

paul@trident:~ $ ls -al /dev/serial/by-id
total 0
drwxr-xr-x 2 root root 60 Aug 23 13:17 .
drwxr-xr-x 4 root root 80 Aug 23 13:17 ..
lrwxrwxrwx 1 root root 13 Aug 23 13:17 usb-Klipper_stm32f446xx_140046000150335331383520-if00 -> ../../ttyACM0

 


2 hours ago, pfm said:

I'm using a U2C to connect to the USB port on the Octopus.

First make sure Klipper is NOT running. I have found that this interferes with getting the UUIDs.

(The BTT GitHub also reports this to be the case)

[Screenshot: image.png]

sudo service klipper stop

You can also temporarily comment out (#) the [mcu] sections in the printer.cfg file.

Make sure the jumper cap for the 120 ohm termination resistor is fitted.

Then retry getting the UUIDs.
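A common way to confirm the termination jumpers is to power everything off and measure the resistance between CAN Hi and CAN Lo. The sketch below works out what to expect, assuming one 120 ohm terminator at each end of the bus (for example on the U2C and on the toolhead board). The helper names and the 15% tolerance are made up for illustration.

```python
def parallel_resistance(*ohms):
    """Combined resistance of resistors wired in parallel."""
    if not ohms:
        raise ValueError("need at least one resistor")
    return 1.0 / sum(1.0 / r for r in ohms)


def describe_termination(measured_ohms, tolerance=0.15):
    """Guess how many 120 ohm terminators are active from a meter reading.

    Returns 1, 2 or 3, or None if the reading matches none of them.
    tolerance is an arbitrary allowance for meter and wiring error.
    """
    for count in (1, 2, 3):
        expected = parallel_resistance(*([120.0] * count))
        if abs(measured_ohms - expected) <= expected * tolerance:
            return count
    return None


if __name__ == "__main__":
    print(parallel_resistance(120.0, 120.0))  # two terminators in parallel
    print(describe_termination(61.0))
```

Two terminators in parallel come out at about 60 ohm. A reading near 120 ohm suggests one jumper is missing, and a reading near 40 ohm suggests a third terminator is active somewhere.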

If that does not work, then maybe the U2C did not flash properly.

If the connections are correct, then that means something went wrong in the flashing of the U2C. Your can0 interface is up and running and is not the issue. This is one of the frustrations with CAN bus: it sometimes takes multiple attempts to flash the board correctly.

My setup is different, as I use the Octopus board as the USB to CAN bridge and not a U2C. I will look into the process of using/flashing the U2C board.


47 minutes ago, mvdveer said:

First make sure Klipper is NOT running. I have found that this interferes with getting the UUIDs.

Killed Klipper, same result, no UUID.

Hmm, but in my setup I connected my Pi to USB IN on the U2C and connected another Pi USB port to the Octopus. Perhaps that's my problem. I just tried simply switching the USB connections around, but that gave me:

  

paul@trident:~/CanBoot/scripts $ python3 flash_can.py -i can0 -q
ERROR:root:Flash Error
Traceback (most recent call last):
  File "/home/paul/CanBoot/scripts/flash_can.py", line 493, in run_query
    self.cansock.bind((intf,))
OSError: [Errno 19] No such device

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/paul/CanBoot/scripts/flash_can.py", line 621, in main
    loop.run_until_complete(sock.run_query(intf))
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/paul/CanBoot/scripts/flash_can.py", line 495, in run_query
    raise FlashCanError("Unable to bind socket to can0")
FlashCanError: Unable to bind socket to can0

So I think I need to reflash everything.  I'll give that a go.
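For what it's worth, `[Errno 19] No such device` on the bind call means the kernel has no network interface named can0 at that moment, which is what you would expect if the USB adapter that provides it is not plugged into the Pi. Below is a minimal sketch for checking that before running the flash script. `can_interface_exists` is a hypothetical helper; `socket.if_nameindex()` is the standard-library call that lists the interfaces.

```python
import socket


def can_interface_exists(name="can0", interfaces=None):
    """Return True if a network interface with this name is present.

    interfaces can be passed in for testing. By default the list comes
    from socket.if_nameindex(), which returns (index, name) pairs.
    """
    if interfaces is None:
        interfaces = socket.if_nameindex()
    return any(ifname == name for _, ifname in interfaces)


if __name__ == "__main__":
    if can_interface_exists("can0"):
        print("can0 is present")
    else:
        print("can0 is missing: check the USB-to-CAN adapter and its cable")
```

If can0 comes back after restoring the original cabling, the swap itself was the cause of this particular error, and a reflash may not be needed for it.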


1 minute ago, pfm said:

Hmm, but in my setup I connected my Pi to USB IN on the U2C and connected another Pi USB port to the Octopus

This is the connection as I understand it.

[Image: how-do-i-connect-a-main-board-u2c-can-bus-usb-port (connection diagram)]


21 hours ago, pfm said:

Killed Klipper, same result, no UUID.

Hmm, but in my setup I connected my Pi to USB IN on the U2C and connected another Pi USB port to the Octopus. Perhaps that's my problem. I just tried simply switching the USB connections around, but that gave me:

I can confirm that CAN works well with the U2C connected to an RPi 4 USB3 port (blue) and the Octopus connected to the adjacent USB2 (black) port.

[Photo: IMG_4325.jpg]

Edited by ChicagoKeri
added pic

  • 3 weeks later...

OK, wow, this was quite the odyssey. I was seeing an accumulation of communication errors that would cause klipper to shut down.  I attached a logic analyzer and was seeing bad packets caused by voltage issues between CAN Hi and CAN Lo. 

The EBB2209 board has a daughter board that sits in the Stealthburner shroud. If I removed this board, all of my communication errors disappeared, and my printer was happy.

It turns out that the daughter board has some solder nibs protruding out the back of the board by a small amount.  These nibs were making contact with a metallic label on the 5015 part cooling fan. This was causing a very high resistance short across the pins. I removed the label and the CAN errors disappeared.  

I know a lot of people are reporting phantom low voltage issues with the 2209/2240. I wonder if these have the same root cause. 


  • 1 month later...

After getting my new LDO Voron 2.4 running with the original stock wiring harness to the tool head, using an Octopus MCU and a Raspberry Pi 4, all was good. Then I thought, why not upgrade to CANBus? My first attempt was with the Mellow Fly SB2040. It just wouldn't work. After about a month of all kinds of testing and following every YouTube video I could find, the problem came down to the RPi OS. Since my new RPi 4 was 64-bit, I had installed the 64-bit OS. But as it turns out, the CANBus would only work for me with the 32-bit RECOMMENDED RPi OS version. It would probably work with the 64-bit OS given some magical tweaks, but that is beyond my experience level.

After getting the Fly SB2040 running, I decided to upgrade to the EBB SB2240 because it has more fan ports and fan voltage options. This board includes ports for Chaotic Tap, heater fan RPM monitoring, and an extra fan port that I used for cooling the SB2240, which had to be wired to the daughter board. It works great!

One note: to get the fan RPM signal to work with the Fly SB2040, it requires a BAT85 diode and a 10K pull-up resistor on the signal port. With the EBB SB2240 board, no board mods were required.


TimGehres,  Congratulations on getting CAN to work!

I went the other direction, from BTT EBB SB2209 to the Mellow FLY SB2040 because of persistent thermistor failures across two different BTT boards... they would consistently fail on a subsequent print after a long 5+ hour print. Switching to the Mellow SB2040 solved this for me.

I'm interested to hear how your thermistors and heaters hold up, especially after some long prints.

 

 


20 minutes ago, ChicagoKeri said:

Switching to the Mellow SB2040 solved this for me.

I've been using these exclusively, but had to mount a Mellow Fly SHT36v2 on the Micron and the recent VZBot. I have done 17+ hour prints on the machines with the SB2040s with no issues so far, and a 13 hour print on the SHT36v2 without failures. I have a couple of EBB 2209 and EBB2040 boards but am hesitant to use them after your experience with them.


 

Hi ChicagoKeri,

First of all, the Voron 2.4 is my first entry into the world of 3D printing, so there is a steep learning curve, as you well know. My hardware configuration has been an evolutionary process as I discover new mods. I switched to the EBB SB2240 because working with the Fly SB2040 was so frustrating, and the last straw came when I smoked the HE fan. It was at that time I discovered the EBB SB2240, which is much easier to configure.

I think the SB main bd cooling fan may help as it lowers the bd temperature about 15 degrees. https://www.printables.com/model/485179-stronger-hex-ventilated-sb2209-sb2240-door-with-30

Two mods I installed on the Fly SB2040 were the ChaoticLab Tap and the three wire hot end cooling fan for RPM Monitoring. Pictures included.

One other mod I was going to install was an SB2040 cooling fan, but I changed over to the SB2240 before getting that far.

For the Tap, I used the +5 and Gnd from the ENDSTOP(with gpio28 pin) and ran the Tap signal to gpio25 pin.

Here are the instructions for the HE FAN with RPM signal. In the picture "RPM Wiring" the BAT85 is under the yellow heat shrink.

# NOTE: An OD4010-24HB01A Orion Fan was wired to Hotend Fan1 pads on the SB2040-F board, as follows:
#       Fan Tach pin is wired to PWM0
#       Fan Signal pin is wired to AGND
#       Fan Power pin is wired to +24v
#
#       On the SB2040 main board, the PWM0 header pin from the daughter bd is cut in two on the main bd. (This header pin is pictured in the RPM Wiring photo where the white wire and resistor are soldered together to the header pin. What is not visible is where the header pin was cut, which was below that soldered joint.) The side of the pin coming from the
#       SB2040-F daughter board is wired to the ENDSTOP Plug, GPIO-29, with an inline BAT85 diode with the band away from the main bd plug
#       A 10K pullup resistor is also connected to the PWM0 pin and to the Vcc pin on the header.
#
#       In the fly-SB2040.cfg, change LIMIT_1 to gpio29 and LIMIT_2 to gpio28 like this:
aliases_endstop:
    LIMIT_0=gpio25,LIMIT_1=gpio29,LIMIT_2=gpio28                 # LIMIT_1 = Hot End fan tachometer, gpio29

 

Ref:

https://ellis3dp.com/Print-Tuning-Guide/articles/useful_macros/hotend_fan_monitoring.html

https://mellow.klipper.cn/#/board/fly_sb2040/pins

 

# NOTE: The Hotend Fan is a OD4010-24HB01A Orion 24v Fan.
pin: SB2040: PWM0                               # Hotend fan plugs into VFAN1, signal is gpio16
tachometer_pin: SB2040: LIMIT_1         # Hotend fan tach comes from 4WFAN TACH gpio29
tachometer_ppr: 2
tachometer_poll_interval: 0.0015
heater: extruder
heater_temp: 50.0
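A quick check on the tachometer settings above. As I understand the Klipper config reference, tachometer_poll_interval has to be smaller than 30 / (tachometer_ppr * rpm), with some margin. The sketch below does that arithmetic; the function names and the `margin` factor are made up for illustration and are not Klipper settings.

```python
def max_rpm_for_poll_interval(poll_interval, ppr=2):
    """Highest fan speed the polling can follow: rpm < 30 / (ppr * interval)."""
    return 30.0 / (ppr * poll_interval)


def poll_interval_ok(expected_rpm, poll_interval=0.0015, ppr=2, margin=0.8):
    """True if expected_rpm stays under the limit with some headroom.

    margin is an arbitrary safety factor chosen for this sketch.
    """
    limit = max_rpm_for_poll_interval(poll_interval, ppr)
    return expected_rpm <= margin * limit


if __name__ == "__main__":
    limit = max_rpm_for_poll_interval(0.0015, ppr=2)
    print(f"limit with ppr=2 and 0.0015 s polling: about {limit:.0f} RPM")
```

With tachometer_ppr: 2 and tachometer_poll_interval: 0.0015 the limit works out to roughly 10000 RPM, so it is worth comparing that against the rated speed of whatever fan is fitted.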

 

 

 

RPM Wiring.jpeg

Chaotic Tap.jpeg

Daughter bd Wiring.jpeg


  • 2 weeks later...
On 9/11/2023 at 7:27 AM, pfm said:

OK, wow, this was quite the odyssey. I was seeing an accumulation of communication errors that would cause klipper to shut down.  I attached a logic analyzer and was seeing bad packets caused by voltage issues between CAN Hi and CAN Lo. 

The EBB2209 board has a daughter board that sits in the Stealthburner shroud. If I removed this board, all of my communication errors disappeared, and my printer was happy.

It turns out that the daughter board has some solder nibs protruding out the back of the board by a small amount.  These nibs were making contact with a metallic label on the 5015 part cooling fan. This was causing a very high resistance short across the pins. I removed the label and the CAN errors disappeared.  

I know a lot of people are reporting phantom low voltage issues with the 2209/2240. I wonder if these have the same root cause. 

Could you share a few pics of the issue? I face exactly what you have described till this post... Unfortunately I do not have a logic analyzer to troubleshoot further😂


Sooo, the trouble returned a few days ago.  Clearly, it was not the metallic label that was causing the issue. But, in all cases, when I removed the daughter card, the trouble went away.  When I reconnected the daughter card, the trouble returned. I then removed the screws that attached the card to the housing, but left the daughter card plugged in, and the trouble went away.  When I put the screws back in, the trouble returned. I further isolated it to one of the two screws (the one farther from the header). 

I suspect that the copper ground plane and one of the signal traces of the PCB are exposed in the side wall of the screw hole in the PCB and the screw creates a short. This is likely either a design defect in the SB0000 or a manufacturing defect with this particular board. I've been running it without this screw for the past few days and have had no problems. 

[Photo: Photo1.jpg]


On 10/22/2023 at 7:00 PM, pfm said:

Sooo, the trouble returned a few days ago.  Clearly, it was not the metallic label that was causing the issue. But, in all cases, when I removed the daughter card, the trouble went away.  When I reconnected the daughter card, the trouble returned. I then removed the screws that attached the card to the housing, but left the daughter card plugged in, and the trouble went away.  When I put the screws back in, the trouble returned. I further isolated it to one of the two screws (the one farther from the header). 

I suspect that the copper ground plane and one of the signal traces of the PCB are exposed in the side wall of the screw hole in the PCB and the screw creates a short. This is likely either a design defect in the SB0000 or a manufacturing defect with this particular board. I've been running it without this screw for the past few days and have had no problems. 

 

Good find!

As I have two of these SB0000 boards lying around, I grabbed an ohmmeter and tried to find any connection between either screw hole and any connection on the board and neither board shows any kind of connection to the screw hole. Even the ground plane shows no continuity through  the screw hole and the solder mask insulates even that all the way up to the hole. The ground plane shouldn't mind at all if the screw contacts it.

The CAN bus data lines and a 5v line go to the connector closest to your removed screw, are easily visible on the backside of the board and do not go anywhere near the screw hole. Only the ground plane appears to intersect the screw hole.

The data lines CAN H and CAN L are very thin and very close together.  Perhaps the board is flexing when the screw is tightened and provoking a CAN H to CAN L short circuit?    Could anything be shorting out the 4 pin data connector near your removed screw?

By the way, I purchased the 2nd board because of thermistor issues with the first one, and then the second one also.  The CAN bus has always worked well for me on either board.

Edited by ChicagoKeri
btw...

And I thought I had a tough time getting the CAN bus working on my setup... lol. Having spent many years dealing with low voltage wiring in the alarm industry, I know the trials and tribulations of tracking down a partial short or some such thing. My hat's off to you for the persistence in nailing it down. I'd also agree that the "bending" of the board is the more likely culprit; I've run into that in more than one situation with PCBs in tight places.
