Jobs are being created with VT-x / AMD-V disabled
Author | Message |
---|---|
Evans CAH Joined: 19 Apr 09 Posts: 7 Credit: 373,938 RAC: 0 |
All Cosmology jobs are being created with the VT-x / AMD-V flag reset. This makes running the project more or less impossible. The jobs run, but they clobber the machine so badly that it is unusable. If I set the flag manually for each waiting VM, they and other VMs run normally. This may not be a Cosmology problem - I think Atlas was doing the same thing. I can't verify that right now as there is no Atlas work available. The only mention of VT-x in any log is in the VM log: 00:00:02.108645 HM: VT-x/AMD-V init method: LOCAL, but this entry looks the same for VMs with the flag set. I don't see a way to enforce this setting globally. This is an AMD box. Anyone else seeing this? Google says not. VB 6.02, BOINC 7.14.2 & 7.15 |
Jonathan Joined: 27 Sep 17 Posts: 190 Credit: 8,338,009 RAC: 3 |
I am not seeing any unusual problems with my AMD computer. I looked at your tasks and it looks like you recently aborted about 5 tasks using the GUI. The error in the log was: 2019-02-05 11:56:24 (2664): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time. https://www.cosmologyathome.org/result.php?resultid=5272870 Is this the error you are having, so that you have to abort the work units, or is it on different work units? Do you have an example work unit? I am not sure what you mean by the VT-x / AMD-V flag problem. Can you explain where you are seeing this problem? Is it in the logs, the VirtualBox software, etc.? I forgot to add that I am on VirtualBox 5.2.24. Occasionally I get the 'lost communication' problem, but those errors seem to have gone down since I limited Cosmology@home to the true number of processors I have (eight in total) using app_config.xml. |
Jonathan Joined: 27 Sep 17 Posts: 190 Credit: 8,338,009 RAC: 3 |
I just upgraded VirtualBox to 6.0.4 and will let it run. I set 'no new tasks' and let the queue run out before upgrading. I only have cosmology running. |
Evans CAH Joined: 19 Apr 09 Posts: 7 Credit: 373,938 RAC: 0 |
Thanks for the feedback. I am seeing it in the VirtualBox interface under Settings/System/Processor. Every Cosmology VM has AMD-V disabled. While the VMs are running, the machine is unusable. The flag can be manually set, and the setting sticks. I probably aborted jobs that are 'unmanageable'. This has nothing to do with AMD-V. I should add I am using an app_config to limit each VM to two CPUs. Anyone else on VB 6.02? |
Jonathan Joined: 27 Sep 17 Posts: 190 Credit: 8,338,009 RAC: 3 |
"Enable Nested VT-x / AMD-V" is unchecked in VirtualBox. Is that what you are referring to? That is not checked on any of the work units for Cosmology and isn't used. You only need it on the 'host' and not in the 'guest'. I think your problem is running too many Cosmology work units at once. Try changing your 'app_config.xml' to run one concurrent task of either two or three CPU cores. VirtualBox is running at a higher priority than the regular BOINC tasks that don't use VirtualBox. |
Evans CAH Joined: 19 Apr 09 Posts: 7 Credit: 373,938 RAC: 0 |
That's what I'm referring to, yes. I just realized this setting enables nested hardware virtualization, new in VB 6.0, which obviously isn't needed. BOINC is correctly assigning CPUs up to the maximum. After a bit of experimentation, I think my issue is exactly what you describe: that I (and BOINC) can't set the scheduler priority of individual VMs. On this box I am using VB for non-BOINC stuff that needs to be responsive. I can't just demote the VB service to a lower priority because that will demote all VMs, BOINC and non-BOINC. I will try your app_config fix, but last I checked all Cosmology jobs had been 'unmanageable' for days and I had to humanely kill them. For what it's worth there is another workaround, which is to allow BOINC network access only at night, when the machine doesn't have fussy users on it. The VM jobs won't run without internet access, so they politely wait. The side-effect is that you have to run a huge job cache so that the machine has something to do during the day. |
Jonathan Joined: 27 Sep 17 Posts: 190 Credit: 8,338,009 RAC: 3 |
I think VirtualBox 5.2.8 isn't as fussy about the 'lost communication' issues. There are posts at the LHC@home forums on it too. I only had that issue when I was running more than two VirtualBox jobs at once. Link to my forum post at bottom. You might be better off just sticking to the conventional BOINC applications that don't use VirtualBox. I think that Cosmology@home can run the VirtualBox jobs without network access. https://www.cosmologyathome.org/forum_thread.php?id=7615 |
Evans CAH Joined: 19 Apr 09 Posts: 7 Credit: 373,938 RAC: 0 |
Related issue: a bug in the BOINC client meant that once system-level virtualization was switched off (e.g. after a BIOS flash or reset) BOINC would then assume it was off forever. The effect was that VB projects would be ignored irrespective of the firmware setting. This was fixed in March 2020 in client 7.16 (see https://boinc.berkeley.edu/wiki/Release_Notes) |
Joined: 19 Jul 18 Posts: 6 Credit: 16,577,897 RAC: 44,163 |
Could any of this be related to why my 3950X won't run more than 1 task at a time? Once upon a time I installed Oracle VM VirtualBox on this PC and was using it to crunch Universe@Home. I no longer use VM. Would it have anything to do with the settings in my BIOS which I had to change regarding virtualisation? Oh, I do not have this machine set to "work" or "home" profile. I checked that. Edit - All good, I fixed it. Removed VM from my PC. |
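
Jonathan's suggestion earlier in the thread (one concurrent Cosmology task using two or three CPU cores, set through app_config.xml) could be sketched roughly as follows. This is only an illustration, not the configuration either poster actually used. The element names follow the BOINC app_config.xml layout, but the application name `camb_boinc2docker` and plan class `vbox64_mt` are assumptions; check the project's applications page or your client's event log for the real values. The helper `build_app_config` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, plan_class: str,
                     max_concurrent: int, ncpus: int) -> str:
    """Return the text of a BOINC app_config.xml limiting one application."""
    root = ET.Element("app_config")

    # <app>: cap how many tasks of this application run at the same time.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # <app_version>: how many CPU cores each task of this plan class is given.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(ncpus)

    # Pretty-print where available (ET.indent exists from Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumed application name and plan class; replace with your project's.
    print(build_app_config("camb_boinc2docker", "vbox64_mt",
                           max_concurrent=1, ncpus=2))
```

The printed text would go into app_config.xml in the project's folder inside the BOINC data directory, after which the client needs to re-read its config files (or be restarted) for the limits to take effect.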
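
For anyone who wants to check the virtualization flags discussed in this thread without opening Settings/System/Processor for each VM, a rough sketch is below. It assumes `VBoxManage` is on the PATH and that `VBoxManage showvminfo <vm> --machinereadable` prints `key="value"` lines; the key names `hwvirtex` and `nested-hw-virt` are assumptions about that output and may differ between VirtualBox versions. The function names are hypothetical.

```python
import subprocess


def parse_machinereadable(text: str) -> dict:
    """Parse lines of the form key="value" (or key=value) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines and anything without a key=value shape
        info[key.strip().strip('"')] = value.strip().strip('"')
    return info


def virtualization_flags(vm_name: str) -> dict:
    """Query one VM through VBoxManage and return the two flags of interest."""
    out = subprocess.run(
        ["VBoxManage", "showvminfo", vm_name, "--machinereadable"],
        check=True, capture_output=True, text=True,
    ).stdout
    info = parse_machinereadable(out)
    # Assumed key names; .get() yields None if this version does not report them.
    return {key: info.get(key) for key in ("hwvirtex", "nested-hw-virt")}
```

If the assumed key names are right, `nested-hw-virt` reporting `off` while `hwvirtex` reports `on` would match the conclusion reached above: the unchecked box is the nested setting, which the Cosmology VMs do not need.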