jeplus.org forums
Issue with project validation on large project with jEPlus+NET (solved) - Printable Version




Issue with project validation on large project with jEPlus+NET (solved) - oddovalspin - 11-04-2014

Hi,

I am trying to run parametric simulations of 12 parameters with 5 values each for a Monte Carlo analysis. This is a test run for a future sample of about 20 parameters of about 30 values each.

I have run a sample of 150 simulations previously using jEPlus1.5 and this took about 5 hours on my i7 Laptop which can run 8 simultaneous threads. I am looking at several thousand simulations for my calibration project and need to scale up my processing power. I have access to a number of redundant but still quite powerful computers so Yi suggested I use jEPlus+NET.

jEPlus+NET1.2 is installed on four eight-thread machines (until I can scrounge some more), one of which is working as the server. jEPlus+NET has worked on a small single-sample job, which simulated correctly on an execution node.

My problem is when I try to scale up the number of simulations. My first attempt at 12 parameters crashed, so I have started adding parameters to the tree and validating the project after each new parameter. The project validated fine at 1 parameter (5 jobs), 2 parameters (25 jobs), 3 parameters (125 jobs), 4 parameters (625 jobs)... 7 parameters (78125 jobs) took about 1 second to validate and 8 parameters (390625 jobs) took 4.5 seconds. When I try to add the ninth parameter, jEPlus+NET hangs for a while, prints "validating project, hang on..." and then hangs indefinitely.
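
The job counts quoted above are those of a full factorial tree, i.e. the product of the number of values per parameter. A minimal Python sketch of that arithmetic, assuming 5 values for every parameter as in this post (the function name is made up for illustration):

```python
def full_factorial_jobs(values_per_parameter):
    """Number of jobs in a full parameter tree: the product of the
    number of alternative values of each parameter."""
    total = 1
    for n_values in values_per_parameter:
        total *= n_values
    return total


# Job count for 1 to 12 parameters with 5 values each.
for n_params in range(1, 13):
    print(n_params, full_factorial_jobs([5] * n_params))
```

With 9 parameters the tree already holds 1953125 jobs, and with 12 it holds 244140625, even though only a small sample of them is ever meant to be simulated.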

I don't want to run hundreds of thousands of simulations - I will be running a Latin Hypercube Sample, but jEPlus+NET won't get as far as starting the server.

I wondered if jEPlus+NET1.2 is using an older "engine" than jEPlus1.5 or perhaps there are some settings I can change.

Thanks in advance,

Regards, David.


RE: Issue with project validation on large project with jEPlus+NET - Yi - 11-04-2014

Hi David,

The early versions of jEPlus actually create all the jobs during the validation process - a silly thing to do, and you can imagine why it falls over when the project is big. There isn't a later release of jEPlus+Net after 1.2. However, you can try this hack and see if it works.
Go to SourceForge and download jEPlus v1.3 build 05. Extract the jEPlus.jar file from the package, and use it to replace the one in the lib\ folder of jEPlus+Net. Then, cross your fingers...

Cheers,

Yi


RE: Issue with project validation on large project with jEPlus+NET - oddovalspin - 11-05-2014

Thanks for that Yi.

I downloaded jEPlus v1.3 build 05. In \lib, I renamed jEPlus.jar as jEPlus.jar.old and copied jEPlus.jar from the unzipped folder into \lib. Initially, I didn't make any changes to the jEPlus.jar files on the execute nodes.

The hack gets me past the job validation. jEPlus+NET on node0 (server node) gives me:

Simulation work directories and results will be stored in C:\blah\blah
A LHS sample of 150 has started...
5 Nov 2014 09:27:48 GMT (Agent Job Server) Job server started. 150 jobs to execute. Waiting for Nodes to register.

Then I started the execution nodes. I get a whole pile of comments, typically:

Wed Nov 09:37:35 GMT 2014 [-1] ACMD Wed Nov 05 09:37:46; JOBSERVER_1;192.168.1.1; Command=NODE_UPDATE
Wed Nov 09:37:35 GMT 2014 [-1] ExecNode Manager responded:RCMD[0] Text=ExecNode WN_node1-PC6 started since Wed Nov 05 09:28:58 GMT 2014 is currently WAITING
Total jobs processed:0

(The description repeats for each processor).

Each processor reports:

Wed Nov 05 09:29:09 GMT 2014 [10000262] Connected with server with Serial number: 1000262
Wed Nov 05 09:29:09 GMT 2014 [0] Node WN_node1-PC_6 registered with server 192.168.1.1:2992.
Wed Nov 05 09:29:31 GMT 2014 [0] Sending job (std single) request to Server:2992@node0-PC
Wed Nov 05 09:35:29 GMT 2014 [-99] java.io.EOFException

All processors sit on 100.0% Idle. I assume this last line reports where the process is falling over.

I tried copying the jEPlusv1.3 jEPlus.jar files to \lib on the execute nodes, but the nodes then failed to start so I reset them to the jEPlus+NETv1.2 originals.

Do you have any other suggestions?

Regards, David.


RE: Issue with project validation on large project with jEPlus+NET - Yi - 11-05-2014

Bad luck... so there are deeper incompatibilities between 1.2 and 1.3. Another way I think might work is to use a job list file. If you specify parameter values (instead of indexes) in the job list file, the alt values specified in the parameter tree will be ignored. In this case you can put {1} as the alt values for all parameters, which will solve the validation problem.

To create this job list file, you can use the latest jEPlus with the full project. Do a LHS run as you would, but cancel the simulations as soon as it starts. This will still give you the SimJobIndex.csv file containing the whole sample. You can then edit this file to make a job list:

- Remove the first row and the first column
- Replace the weather file names with their corresponding indexes
- Replace the idf file names with their corresponding indexes
- Save again as CSV
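
The editing steps above can also be scripted. The following is a minimal Python sketch, not part of jEPlus: the function names and output file name are hypothetical, and it assumes the weather and IDF file names appear in the CSV exactly as they are listed in the project, with indexes starting at 0. It replaces any cell matching a known file name rather than relying on fixed column positions.

```python
import csv


def to_job_list(rows, weather_files, idf_files):
    """Turn SimJobIndex.csv rows into job-list rows.

    rows          -- list of rows (lists of strings) read from the CSV
    weather_files -- weather file names, in project order (index 0 first)
    idf_files     -- IDF file names, in project order (index 0 first)
    """
    # Map each known file name to its zero-based index, as text.
    lookup = {name: str(i) for i, name in enumerate(weather_files)}
    lookup.update({name: str(i) for i, name in enumerate(idf_files)})

    job_rows = []
    for row in rows[1:]:                       # remove the first row
        if not row:
            continue                           # skip blank lines
        cells = [c.strip() for c in row[1:]]   # remove the first column
        job_rows.append([lookup.get(c, c) for c in cells])
    return job_rows


def convert(src_path, dst_path, weather_files, idf_files):
    """Read SimJobIndex.csv and save the job list again as CSV."""
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(to_job_list(rows, weather_files, idf_files))


# Example call (file names are placeholders):
# convert("SimJobIndex.csv", "joblist.csv", ["weather.epw"], ["model.idf"])
```

It is worth opening the result and checking a few rows by hand before pointing jEPlus+NET at it.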

I have actually tried this method and it seems to work on my computer. So again, good luck!

Yi


RE: Issue with project validation on large project with jEPlus+NET - oddovalspin - 11-05-2014

OK working now. There are a couple of hooks so I will describe the whole process for anyone else trying this.

I have four redundant PCs set-up on a stand-alone network. Each PC has an Intel i7 CPU which can run up to 8 processes. The PCs are connected to a NetGear GS108T switch via Cat6 patch leads. The PCs are running Windows 7 and are dedicated to EnergyPlus simulations. The operating systems were recently installed. All software is loaded using a flash drive.

The first task was to network the PCs so that they could share data. While the operating system was being installed, usernames and PC names were assigned to each PC. These were node0 on node0-PC, node1 on node1-PC, node2 on node2-PC and node3 on node3-PC. The Windows firewalls were deactivated and all files and programs were shared. Remote connections were allowed to all PCs, although in practice I am only using remote access to control execution nodes 1, 2 and 3 from the master node, node0. One key finding is that you must have passwords for all the usernames. I had previously not chosen passwords for the nodes, but Windows Remote Desktop won't work without passwords being set. Also make sure that all the computers show the same time on their clocks, or Windows won't let the PC join the workgroup. All nodes were set with static IP addresses: 192.168.1.1 to 192.168.1.4.
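
Once the network is set up, a quick way to confirm that an execution node can reach the job server is a plain TCP connection test. This is a minimal Python sketch and not part of jEPlus+NET; it assumes Python is available on the node, that the job server is already running, and that it listens on port 2992 at 192.168.1.1, as the log lines earlier in this thread suggest. The function name is made up for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call from an execution node (address and port are assumptions
# taken from the logs in this thread):
# print(can_connect("192.168.1.1", 2992))
```

A False result before the server has been started is expected; after "Job server started" appears on node0 it points to a network or firewall problem.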

Once I could remote into all the execution nodes from node0, I copied the installation files for EnergyPlus (in my case 7.2.0), Java (jre-7u67-windows-x64) and jEPlus+NETv1.2 into My Documents on each node. Then I installed EnergyPlus and Java and unzipped jEPlus+NET onto each node.

Setting up the jobs is similar to jEPlus. I had already modified my IDF file with my @@tags@@ and set up the meters I need with IDF-Editor. I put my IDF, EPW and MVI files in a subdirectory of jEPlus+NET (one new folder for each simulation). I set up jEPlus+NET by browsing to the location of the three essential files as per usual with jEPlus.

I already had a CSV file from a previous simulation on a stand-alone computer, so as per Yi's recommendations, I deleted the top row and first column using Excel. Since I am only using one weather file and one IDF file, the contents of the first two columns which contain the names of EPW and IDF files were replaced with 0s (the index for the first file). Then I saved the file (as a CSV file) and copied the file into the same directory as the IDF, EPW and MVI files.

As per Yi’s suggestion, I named all the parameter tags for my IDF file in the parameter tree but set the values to {1}, then browsed the "job list in file" field on the Execution tab to the directory with the CSV file, and set the execution controller to Local batch simulation controller.

Before I started the simulations, I minimised jEPlus+NET and started remote desktops for each of the execution nodes, but did not start the GUIs. After each connection to the execution nodes was established, I minimised the remote access window for later use.

Then I started the simulation by clicking the “Start Simulation” button.

The main jEPlus+NET v1.2 GUI then showed:

Simulation work directories and results will be stored in: C:\Users\node0\Documents\jEPlus+NET_v1.2\output\
Batch started ...
5 Nov 2014 13:27:09 GMT [Agent Job Server] Job server started. 24 jobs to execute. Waiting for Nodes to register...

I then maximised the remote access window for each execution node in turn. On each of the execution nodes I had already created a desktop shortcut for runnode.bat so starting each node was a matter of double clicking on the runnode batch file shortcut. When the node has started, a GUI opens but there is nothing else that needs to be done. There are a number of check boxes on the bottom left hand pane but these do not need to be checked to start the simulations. When all of the nodes had been started the remote connections were minimised and I returned to the server node GUI.

The status of the nodes can be tracked from the Execution tab by clicking the “Show Server Monitor” button. This opens a new window. To get a report on the number of nodes that are running and the status of the jobs, click on “Jobs info”. It takes a few minutes for the nodes to start pulling jobs from the server, so wait a little while before you click “Jobs info”. There is also a short delay between calling for status and the reply.

If the simulations are running you get something like:

JEPlus Client MON_node0-PC_0 initialized ...
Sending INQ to Server: 2992@127.0.0.1
Reply received: Jobs summary:
Remaining jobs: 126
Running jobs: 24
Completed jobs: 0
Rejected jobs: 0
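
If you copy this reply out of the monitor window, the counts can be pulled out with a few lines of Python to track progress. This is a minimal sketch that assumes the reply has exactly the layout shown above; the function name is hypothetical and nothing here talks to the server itself.

```python
import re

# Matches lines such as "Remaining jobs: 126".
_SUMMARY_LINE = re.compile(
    r"^(Remaining|Running|Completed|Rejected) jobs:[ \t]*(\d+)[ \t]*$",
    re.MULTILINE,
)


def parse_jobs_summary(text):
    """Return a dict of job counts found in a 'Jobs summary' reply."""
    return {m.group(1).lower(): int(m.group(2))
            for m in _SUMMARY_LINE.finditer(text)}


reply = """Reply received: Jobs summary:
Remaining jobs: 126
Running jobs: 24
Completed jobs: 0
Rejected jobs: 0
"""

counts = parse_jobs_summary(reply)
total = sum(counts.values())
print(counts)
print("completed", counts["completed"], "of", total)
```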

There is also a button for detailed information on all the processes on all the nodes called “ExecNodes info” but I found “Jobs info” more useful. Also you have to scroll down to see the results.

One trap seems to be that once the execution nodes have started I can’t run another batch of simulations until the execution controller has been stopped. This needs to be done by clicking on the close window cross (x) on the DOS Command Window as the runnode GUI becomes non-responsive. For good measure I am now restarting all the PCs between runs.

I hope this helps someone else. This is a very powerful tool.

Regards, David.