0001274: Accessing CAN locks up the system

ID	Project	Category	View Status	Date Submitted	Last Update

0001274	fss5pv210_Linux	CAN	public	2012-08-24 19:09	2014-01-28 11:44

Reporter	Keller	Assigned To
Priority	normal	Severity	minor	Reproducibility	always
Status	resolved	Resolution	fixed
Product Version	armStoneA8-V1.1
Target Version	fss5pv210-V2.0	Fixed in Version	fss5pv210-V2.0

Summary	0001274: Accessing CAN locks up the system
Description	If a CAN message is sent or received, the whole system freezes.

Forum Link

Keller 2012-08-24 19:46 manager ~0000781	The MCP2515 signals interrupts with a low level on its interrupt line, so in principle the IRQ used by the driver should also be low-level triggered. The mcp251x driver uses an interrupt service thread (IST) to handle the CAN interrupt. In addition to the thread, there should be a hard interrupt handler that removes the interrupt source and then returns IRQ_WAKE_THREAD to defer the main work to the IST. In our case, however, removing the interrupt source would mean masking the interrupt of the GPIO pin, because clearing the real source in the CAN controller requires an SPI transfer, which cannot be done in hard interrupt context.
But there is no infrastructure available in the kernel for a driver to just mask the interrupt. We can only call disable_irq(). However, the IST checks whether the IRQ is disabled and then does not do its job. So if we have a low-level interrupt and switch off the IRQ in the handler, the interrupt never gets handled in the IST. And if we keep the IRQ enabled (as the kernel's default handler does if the driver provides no handler of its own), we are stuck in an endless loop, because the handler is called again and again as long as the source is not removed.
The standard driver solves this dilemma by using a falling-edge triggered interrupt, which avoids the endless loop. This is the solution that we have also implemented now. However, I am not sure whether this can cause a race condition in which we miss an interrupt if a second falling edge is detected while we are still in the IST. The handler is then called again, but I cannot tell whether the IST really runs again after it returns from the current pass.
Another idea is to switch the IRQ to falling-edge while the IST is running, to avoid the endless loop, and to switch it back to low-level when the IST returns. But this always triggers the handler twice, because a low-level interrupt keeps the interrupt pending flag set. The handler is called for the low-level interrupt, the pending flag is already set again, the handler switches the interrupt to falling-edge and returns. Then it is called once more because of the pending flag and returns a second time. Only now is no new interrupt pending, and the IST can do its work. Because of this double invocation, this is not a perfect solution either.
The best thing would be a function that masks a GPIO interrupt without officially disabling it at the same time. The functions mask_irq()/unmask_irq() exist in kernel/irq/chip.c, but at first glance they cannot be called from a driver. Maybe we can check this again if the version with the falling-edge interrupt does not work reliably.
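
The three trigger strategies compared in this note can be illustrated with a small userspace model. This is only a sketch, not kernel or driver code: all names (irq_model, count_handler_calls, the strategy enum) are hypothetical, and the model assumes that the pending flag is acknowledged on handler entry and that a low-level trigger sets it again immediately while the line is still low. It counts how often the hard handler runs before the IST gets a chance to clear the source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a GPIO interrupt; not a kernel API. */
enum trigger { TRIG_LEVEL_LOW, TRIG_EDGE_FALLING };
enum strategy { KEEP_LEVEL, USE_FALLING_EDGE, SWITCH_TO_EDGE_IN_HANDLER };

struct irq_model {
    enum trigger trig;
    bool line_low;  /* INT pin is asserted (active low) */
    bool pending;   /* pending flag in the interrupt controller */
};

/* Assumption: in level mode the flag is set again while the line is low. */
static void latch_level(struct irq_model *m)
{
    if (m->trig == TRIG_LEVEL_LOW && m->line_low)
        m->pending = true;
}

/* The CAN controller pulls its INT line low. */
static void assert_line(struct irq_model *m)
{
    bool was_low = m->line_low;

    m->line_low = true;
    if (m->trig == TRIG_EDGE_FALLING && !was_low)
        m->pending = true;  /* one flag per high-to-low transition */
    latch_level(m);
}

/*
 * Returns the number of hard handler calls before the IST could run.
 * A result equal to max_calls means the handler never stopped firing.
 */
int count_handler_calls(enum strategy s, int max_calls)
{
    struct irq_model m = {
        .trig = (s == USE_FALLING_EDGE) ? TRIG_EDGE_FALLING : TRIG_LEVEL_LOW,
        .line_low = false,
        .pending = false,
    };
    int calls = 0;

    assert(max_calls > 0);
    assert_line(&m);

    while (m.pending && calls < max_calls) {
        m.pending = false;  /* acknowledged on handler entry */
        calls++;
        latch_level(&m);    /* source not removed: line is still low */
        if (s == SWITCH_TO_EDGE_IN_HANDLER)
            m.trig = TRIG_EDGE_FALLING;
    }
    /* Only when nothing is pending can the IST clear the source via SPI. */
    return calls;
}
```

In this model, KEEP_LEVEL runs into the call limit (the endless loop), USE_FALLING_EDGE calls the handler once, and SWITCH_TO_EDGE_IN_HANDLER calls it twice, which matches the behaviour described above. The possible race with a second falling edge during the IST is not modelled.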
