Thread: New oleg firmware version

  1. #676
    Join Date
    Nov 2006
    Location
    Russia, Moscow
    Posts
    3,640
    Quote Originally Posted by wpte View Post
    I've transferred over 400GB already, but I did notice some hiccups.
    EXT4 seems to work well overall, with good performance etc.
    But after a while (seemingly at random) I get a segmentation fault in cp, the drive becomes inaccessible, and the router is unable to umount it, even when the drive is removed.
    I don't see any oops; maybe I should turn extra debugging on?
    The router is also not able to reboot after the segfault.
    If you can't even reboot, it looks like a kernel problem.
    It could be either ext4 (jbd2) specific or a problem in the common writeback/bdi interfaces. I have never heard of anybody running such a stress test on a router; 400GB is a really huge amount of data!

    It would be best if you could prepare a minimal reproducible test case.
    If the problem is ext4 specific, then switching to ext3 will help. To enable ext4 debugging you have to recompile the ext4.ko and jbd2.ko kernel modules with CONFIG_EXT4_DEBUG=y and CONFIG_JBD2_DEBUG=y.

    Also, please try r4652; there is a small chance that I missed something in the latest backports.

    Anyway, "cp" shouldn't crash silently, so please try running it under gdb to find out the type of fault.
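
    For anyone rebuilding the modules: a quick way to confirm that both debug options really ended up in the kernel .config before compiling might look like the sketch below. The function name check_debug_opts and the default path .config are assumptions made only for illustration; pass the path of your own kernel tree's config file.

```shell
# Report whether the ext4/jbd2 debug options are set to "y" in a kernel
# config file. The default path ".config" is an assumption; pass the real
# path as the first argument.
check_debug_opts() {
    config_file="${1:-.config}"
    for opt in CONFIG_EXT4_DEBUG CONFIG_JBD2_DEBUG; do
        # An enabled boolean option appears as a line "OPTION=y".
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "${opt}: enabled"
        else
            echo "${opt}: NOT enabled"
        fi
    done
}
```

    Calling it with the path of the config file prints one line per option, so a missing option is easy to spot before starting a long build.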

  2. #677
    Join Date
    Dec 2007
    Location
    The Netherlands - Eindhoven
    Posts
    1,767
    Quote Originally Posted by lly View Post
    If you can't even reboot, it looks like a kernel problem.
    It could be either ext4 (jbd2) specific or a problem in the common writeback/bdi interfaces. I have never heard of anybody running such a stress test on a router; 400GB is a really huge amount of data!

    It would be best if you could prepare a minimal reproducible test case.
    If the problem is ext4 specific, then switching to ext3 will help. To enable ext4 debugging you have to recompile the ext4.ko and jbd2.ko kernel modules with CONFIG_EXT4_DEBUG=y and CONFIG_JBD2_DEBUG=y.

    Also, please try r4652; there is a small chance that I missed something in the latest backports.

    Anyway, "cp" shouldn't crash silently, so please try running it under gdb to find out the type of fault.
    Pff, 400GB is nothing.
    But that's the total I copied. cp crashed after the first 70GB or so; after a hard reboot I copied the rest without any problem.
    I didn't find any filesystem corruption, by the way.

    I remember a similar incident with another drive (a test ext4 drive). It had simply been spun down for a long time; it worked for days, but then I wasn't able to umount or access it. So I'm not sure this is a stress problem.

    I will run the cp command under gdb this time and try the latest revision.

  3. #678
    Join Date
    Dec 2007
    Location
    The Netherlands - Eindhoven
    Posts
    1,767
    Quote Originally Posted by lly View Post
    To enable ext4 debugging you have to recompile the ext4.ko and jbd2.ko kernel modules with CONFIG_EXT4_DEBUG=y and CONFIG_JBD2_DEBUG=y.

    I get an error when I try that:
    Code:
    fs/jbd2/journal.c: In function 'journal_free_journal_head':
    fs/jbd2/journal.c:1880:13: error: 'JBD2_POISON_FREE' undeclared (first use in this function)
    fs/jbd2/journal.c:1880:13: note: each undeclared identifier is reported only once for each function it appears in
    I simply set the options with make menuconfig in the kernel tree after all the patches were applied (after the gateway directory was created).

  4. #679
    Join Date
    Nov 2006
    Location
    Russia, Moscow
    Posts
    3,640
    Quote Originally Posted by wpte View Post
    I get an error when I try that:
    Sorry, I should have checked it myself before giving you advice. Will fix it ASAP.

    Updated: Fixed in r4703
    Last edited by lly; 09-11-2012 at 15:49.

  5. #680
    Join Date
    Dec 2007
    Location
    The Netherlands - Eindhoven
    Posts
    1,767
    Quote Originally Posted by lly View Post
    Sorry, I should have checked it myself before giving you advice. Will fix it ASAP.

    Updated: Fixed in r4703
    All right, I've loaded the debug-enabled modules and run cp under gdb.
    This is all gdb gave me:
    Code:
    Program terminated with signal SIGSEGV, Segmentation fault.
    The program no longer exists.
    I've attached the dmesg output; I'm not sure the debugging options helped that much:
    dmesg.txt
    Two oopses are logged.

    I just noticed I also get quite a few IPv6 ICMP spoofing attempts.
    Last edited by wpte; 10-11-2012 at 16:57.

  6. #681
    Join Date
    Nov 2006
    Location
    Russia, Moscow
    Posts
    3,640
    Quote Originally Posted by wpte View Post
    All right, I've loaded the debug-enabled modules and run cp under gdb.
    This is all gdb gave me:
    Code:
    Program terminated with signal SIGSEGV, Segmentation fault.
    The program no longer exists.
    I've attached the dmesg output; I'm not sure the debugging options helped that much:

    Two oopses are logged.
    Thanks for the info. It looks like an unaligned access causes a wrong destination address calculation. It is very strange that the stack trace seems to be incomplete: memdup_user() can't be called from userspace directly. I will try to dig deeper...

    Could you provide the output of
    Code:
    grep unaligned /proc/cpuinfo
    ?

  7. #682
    Join Date
    Dec 2007
    Location
    The Netherlands - Eindhoven
    Posts
    1,767
    Quote Originally Posted by lly View Post
    Thanks for the info. It looks like an unaligned access causes a wrong destination address calculation. It is very strange that the stack trace seems to be incomplete: memdup_user() can't be called from userspace directly. I will try to dig deeper...

    Could you provide the output of
    Code:
    grep unaligned /proc/cpuinfo
    ?
    Code:
    unaligned_instructions  : 687
    Sorry, I had rebooted my router, so I reproduced the problem again before taking this reading.
    By the way, each time I execute that grep, the unaligned instructions counter increases by 3.

  8. #683
    Join Date
    Nov 2006
    Location
    Russia, Moscow
    Posts
    3,640
    Quote Originally Posted by wpte View Post
    Code:
    unaligned_instructions  : 687
    Sorry, I had rebooted my router, so I reproduced the problem again before taking this reading.
    By the way, each time I execute that grep, the unaligned instructions counter increases by 3.
    Strangely, when I use the built-in /bin/grep, it does not increase. The value of unaligned instructions on my router is less than 10...

  9. #684
    Join Date
    Dec 2007
    Location
    The Netherlands - Eindhoven
    Posts
    1,767
    Quote Originally Posted by lly View Post
    Strangely, when I use the built-in /bin/grep, it does not increase. The value of unaligned instructions on my router is less than 10...
    Well, on the RT-AC66U with stock firmware it's 2873311.
    That one has no disk attached or anything.

    I just checked the built-in grep and noticed that the counter didn't increase.
    When I used the optware grep, it increased again.
    Right now it's >900, but the drive is still working.
    Even after I rmmod the ext4, jbd2 and crc16 modules, it still increases by 3 each time.

    Is the optware software compromised, maybe?
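
    To compare two binaries more systematically, one could read the counter before and after running a command and print the difference. The sketch below is only an illustration: the names CPUINFO, unaligned_count and unaligned_delta are made up here, and it assumes the counter line looks like the "unaligned_instructions  : 687" output quoted earlier in the thread. The counter is read with shell built-ins only, so that the measurement itself does not run an extra external program.

```shell
# File to read the counter from; overridable so the functions can be tried
# on a saved copy. The variable name CPUINFO is an assumption of this sketch.
CPUINFO="${CPUINFO:-/proc/cpuinfo}"

# Print the current value of the unaligned_instructions counter.
# Returns non-zero if the file has no such line.
unaligned_count() {
    while IFS=: read -r key value; do
        case "$key" in
            unaligned_instructions*)
                # Unquoted on purpose: word splitting trims the padding
                # around the number.
                echo $value
                return 0
                ;;
        esac
    done < "$CPUINFO"
    return 1
}

# Run the given command and print how much the counter grew while it ran.
unaligned_delta() {
    before=$(unaligned_count) || {
        echo "no unaligned_instructions line in $CPUINFO" >&2
        return 1
    }
    "$@" >/dev/null 2>&1
    after=$(unaligned_count)
    echo $((after - before))
}
```

    Running unaligned_delta once with /bin/grep unaligned /proc/cpuinfo and once with the optware grep (its install path depends on your setup) would then print one number per binary. Other processes on the router can bump the counter in the meantime, so a few repeated runs are more trustworthy than a single one.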

  10. #685
    On r3323 the unaligned instructions counter is increasing all the time:

    Code:
    Every 1.0s: cat /proc/cpuinfo |grep unalign    Sun Nov 11 16:24:53 2012
    
    unaligned_instructions  : 88652442
    unaligned_instructions  : 88653829
    unaligned_instructions  : 88654494
    I have an l2tp connection and openvpn daemons running.
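
    To turn a series of such readings into a growth rate, the difference between consecutive samples can be printed with a small awk filter. This is only a sketch: the function name unaligned_deltas is made up, and it assumes one "unaligned_instructions : N" line per sample on standard input.

```shell
# Read "unaligned_instructions : N" lines on stdin, one per sample, and
# print the increase between each pair of consecutive samples.
unaligned_deltas() {
    awk -F: '
        /^unaligned_instructions/ {
            value = $2 + 0                  # force numeric conversion
            if (seen) print value - prev    # nothing to compare on the first sample
            prev = value
            seen = 1
        }'
}
```

    Fed with the three values above it prints 1387 and 665. If those lines were successive one-second refreshes, that is on the order of a thousand unaligned instructions per second.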

  11. #686
    Quote Originally Posted by staticroute View Post
    On r3323 the unaligned instructions counter is increasing all the time:

    Code:
    Every 1.0s: cat /proc/cpuinfo |grep unalign    Sun Nov 11 16:24:53 2012

    unaligned_instructions  : 88652442
    unaligned_instructions  : 88653829
    unaligned_instructions  : 88654494
    I have an l2tp connection and openvpn daemons running.
    In short, the l2tp unaligned access was fixed a long time ago, but after r3323.

  12. #687
    Join Date
    Dec 2007
    Location
    The Netherlands - Eindhoven
    Posts
    1,767
    No crashes so far since I updated to r4716.
    Two days without a crash, and I've copied maybe 300GB.
    Anyway, let's hope it stays that way.

  13. #688
    Join Date
    Nov 2006
    Location
    Russia, Moscow
    Posts
    3,640
    Quote Originally Posted by wpte View Post
    No crashes so far since I updated to r4716.
    Two days without a crash, and I've copied maybe 300GB.
    Anyway, let's hope it stays that way.
    Thanks for the good news. I hope it is due to r4715 and r4716.

  14. #689
    Join Date
    Dec 2007
    Location
    The Netherlands - Eindhoven
    Posts
    1,767
    Quote Originally Posted by lly View Post
    Thanks for the good news. I hope it is due to r4715 and r4716.
    Still going strong. I've been using my 4TB EXT4 disk as my primary storage location for a few days now.
    Soon I'll also move /opt to an SSD so there's less wear on my big hard drive.

    I've no complaints, but I'd like to give some performance feedback.
    I'm still using the debug-enabled modules, and maybe that's what is causing the extra load, but this is my load graph from the last week:
    [Attachment: graph_image.png]
    You can clearly see a load increase, even though I didn't do anything different from usual.
    Again, no complaints from my side; I'm happy with EXT4, but this may be interesting for people who want to switch.

    This weekend I'll probably go back to the non-debug modules, since everything has been running smoothly.

  15. #690
    Join Date
    Nov 2006
    Location
    Russia, Moscow
    Posts
    3,640
    Quote Originally Posted by wpte View Post
    You can clearly see a load increase, even though I didn't do anything different from usual.
    My guess is that it is due to our old bdi (backing device interface) kernel subsystem. Modern kernels have per-device bdi; ours still uses a single kthread. I hope to be able to backport per-device bdi in the future, but that requires a lot of time and effort.
