We are using the Intel Cave Creek chipset for encryption/decryption of IPsec data traffic. Of late we have faced issues where either some encryption/decryption results are wrong, or every operation that requires the chipset fails altogether. For such issues, we want to know whether there is any debug facility/utility that can be used to narrow down the root cause.
To add details, we are talking about two different issues.
In both cases we have a chassis in which each traffic-handling card has a Cave Creek DH89XXc series chipset attached.
The driver interacts with the chipset for encryption/decryption functions. It is a layer 3 IPsec solution.
These issues are seen on production sites, where we have very limited debugging access.
The problems described below have been seen on multiple chassis nodes, multiple times.
Each time, the problem is resolved by rebooting the hardware.
Issue 1: Here we have multiple SAs. For encryption we give the plain packet to the chipset, and the encrypted packet is forwarded to the network. When the issue arises, we see that after decryption the first 12 bytes are wrong: 8 junk bytes are prepended, and the next 4 bytes are overwritten.
Example: if the plain packet was
01 02 03 04 05 06 07 08 ......
and the SPI of the SA was 0A0B0C0D,
then after decryption we get (_ = prepended junk byte, X = overwritten byte):
_ _ _ _ _ _ _ _ X X X X 05 06 07 08 ........
Note: Otherwise everything works fine; this solution has been deployed for a long time and is widely used.
Once the issue occurs, it persists even if we delete the SA and install a fresh one, until we do a hard reset of the card.
The constant 12-byte corruption suggests to us some memory corruption in the chipset, because a logic problem or memory corruption in the application driver should go away with an application restart or SA deletion.
Once triggered, this problem does not go away until the card is reset, which resets the chipset as well.
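
Since debug access on the production sites is limited, one low-cost option is to have the application flag this exact 12-byte signature when it has a known plaintext to compare against (for example a loopback test packet). The following is only a minimal sketch of such a check. All names in it (classify_decrypt, pkt_verdict_t, the two length constants) are hypothetical and not part of any Intel API, and it assumes that the intact part of the payload follows directly after the 12 bad bytes, as in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; nothing here comes from the QuickAssist API. */
#define JUNK_PREFIX_LEN 8u  /* junk bytes prepended, per the report   */
#define OVERWRITTEN_LEN 4u  /* payload bytes overwritten, per the report */

typedef enum {
    PKT_OK = 0,         /* output matches the expected plaintext          */
    PKT_SHIFT_12,       /* 8 extra bytes, 4 unchecked bytes, intact tail  */
    PKT_OTHER_MISMATCH  /* any other difference                           */
} pkt_verdict_t;

/* Compare a decrypted buffer against the plaintext that was expected. */
pkt_verdict_t classify_decrypt(const uint8_t *expected, size_t expected_len,
                               const uint8_t *actual, size_t actual_len)
{
    const size_t bad_len = JUNK_PREFIX_LEN + OVERWRITTEN_LEN;

    assert(expected != NULL && actual != NULL);

    if (actual_len == expected_len &&
        memcmp(expected, actual, expected_len) == 0)
        return PKT_OK;

    /* Signature: actual = [8 junk][4 overwritten][expected[4..]].
     * Bytes 8..11 of actual are not compared, since they are reported
     * as overwritten. The tail is compared over whatever length both
     * buffers have in common, because the report does not say whether
     * the output length grows. */
    if (expected_len > OVERWRITTEN_LEN && actual_len > bad_len) {
        size_t tail_exp = expected_len - OVERWRITTEN_LEN;
        size_t tail_act = actual_len - bad_len;
        size_t n = tail_exp < tail_act ? tail_exp : tail_act;

        if (memcmp(expected + OVERWRITTEN_LEN, actual + bad_len, n) == 0)
            return PKT_SHIFT_12;
    }
    return PKT_OTHER_MISMATCH;
}
```

Logging the verdict together with the SA and the time of the first PKT_SHIFT_12 hit would at least show whether the corruption starts on one SA or on all SAs at once.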
Issue 2: Our solution uses the Cave Creek chipset in several scenarios:
1. DH group data calculation
2. Encryption/decryption for SAs
3. Sequence number retrieval for a particular SA
We have had reports from production sites that, on some cards, all of the operations listed above that involve interaction with the chipset suddenly stop working and start returning error codes.
Example: cpaCyDhKeyGenPhase1 returns -2, which is CPA_STATUS_RETRY.
This problem is also resolved by a hard reset of the card.
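
As far as we understand, CPA_STATUS_RETRY normally means "resubmit the request", so a short burst of it is expected under load, and what we need to detect is the case where it never clears. The following is a rough sketch of a bounded retry wrapper that separates the two cases. The names submit_fn, drain_fn, retry_report_t and submit_with_retry are hypothetical, fake_submit only simulates a device for demonstration, and the status values are local copies (the -2 is the value quoted above); real code would wrap the actual cpaCy... call and use the definitions from the QuickAssist headers.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies for this sketch only; real code should take these
 * from the QuickAssist headers instead of redefining them. */
#define STATUS_SUCCESS 0
#define STATUS_RETRY   (-2)

typedef int  (*submit_fn)(void *ctx); /* hypothetical: wraps one submit call   */
typedef void (*drain_fn)(void *ctx);  /* hypothetical: polls pending responses */

typedef struct {
    unsigned attempts;    /* submit attempts made                    */
    unsigned retries;     /* attempts that returned STATUS_RETRY     */
    int      last_status; /* status returned by the final attempt    */
    int      stuck;       /* 1 if every attempt returned STATUS_RETRY */
} retry_report_t;

/* Resubmit up to max_attempts times, draining responses between tries. */
retry_report_t submit_with_retry(submit_fn submit, drain_fn drain,
                                 void *ctx, unsigned max_attempts)
{
    retry_report_t r = {0, 0, STATUS_RETRY, 0};

    assert(submit != NULL && max_attempts > 0);

    while (r.attempts < max_attempts) {
        r.last_status = submit(ctx);
        r.attempts++;
        if (r.last_status != STATUS_RETRY)
            return r;          /* success or a hard error: stop retrying */
        r.retries++;
        if (drain != NULL)
            drain(ctx);        /* give queued responses a chance to clear */
    }
    r.stuck = 1;               /* never got anything other than RETRY */
    return r;
}

/* Simulated device for demonstration: returns STATUS_RETRY until the
 * counter that ctx points to reaches zero, then STATUS_SUCCESS. */
int fake_submit(void *ctx)
{
    unsigned *retries_left = ctx;

    if (*retries_left > 0) {
        (*retries_left)--;
        return STATUS_RETRY;
    }
    return STATUS_SUCCESS;
}
```

A report with stuck set for DH, encryption/decryption and sequence number retrieval at the same moment would support the view that the device itself has stopped accepting requests, rather than one operation type being overloaded.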
Both issues point to some corruption at the chipset level that is cleared by re-initialising the chipset.
We would like to know whether such issues have already been reported for the Cave Creek chipset and, if so, what can trigger them.