How to find the problem of a job




A job can fail to render or produce wrong output for many reasons.

The hints below show where you can look for errors.




I. Error List

1. Open the Job Log page in rrControl.

(Select the job and click on the tab "Logs")          


2. Click on "Errors" on the left side.

This is a list of all recognized errors that occurred for the job.




Typical error messages:


Message / Description / What to do?

Crashed

Reasons for this issue:


  • The renderer executable crashed completely
  • The client found an error message in the log file that indicates a severe render issue

Look into the render log files to find the origin of the crash.

If the renderer executable crashed, this is printed in the log file after the main render phase. The information is inside a block of ++++ lines and starts with "++ Render Executable done ++".
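The exit-code line can also be pulled out of a saved render log with a short script. This is a minimal sketch, assuming the line format of the sample log shown later on this page ("++++ Executable returned ... as exit code for frame N."). The function name `find_exit_codes` is hypothetical and not part of Royal Render, and other versions may print the line differently.

```python
import re

# Pattern modelled on the sample log line in this document (an assumption).
EXIT_LINE = re.compile(
    r"Executable returned\s+(-?\d+)\s+\(0x\s*([0-9a-fA-F]+)\)"
    r"\s+as exit code for frame\s+(-?\d+)"
)

def find_exit_codes(log_text):
    """Return a list of (exit_code, frame) tuples found in the log text."""
    return [(int(m.group(1)), int(m.group(3)))
            for m in EXIT_LINE.finditer(log_text)]

sample = ("++++ Executable returned -1073741819 (0x ffffffffc0000005) "
          "as exit code for frame 65.")
print(find_exit_codes(sample))  # prints [(-1073741819, 65)]
```

An empty list means that no such line was found, which does not prove the render succeeded.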

Frozen

Reasons for this issue:


  • The render application froze. It does not continue to render, it does not close, and it does not use any CPU. It simply sits there doing nothing.
  • The render application opened a dialog box and waits for someone to click a button in it. (Windows: The client can handle these dialog boxes, but there can be some dialogs the client does not know about.)


Windows: If the renderer opened a dialog box, then it is shown at the end of the render log.

Output Directory not found

Reasons for this issue:


  • You are probably using a local hard drive for your render output, and this path is not accessible by the client/server.
  • You are using a network path, but the permissions are wrong for that path.
  • Your fileserver runs XP, Vista or Win7 and more than 4 machines are connected to it. See the fileserver's event log for messages.
  • The rrService user was not added to the list of users at your fileserver/NAS.
  • You are rendering cross-OS and the path is not set in the OS conversion table in rrConfig.

To test it, log in at the client that reported this error. Start rrControl (or rrWebsite), select the job and open the "Job Info" tab. Click on the output file. This opens the folder into which the rendering should be saved.


Note: If you are running the client as a service, you may have to log in as the rrService user, the one you used in rrWorkstationinstaller to install the service.


You probably have to change the output path in your scene file to a network path that can be found and accessed by all clients.


rrServer:

If the rrServer cannot reach the output directory, it logs this error for the job. The rrServer checks again after 120 seconds or when a client returns the job, whichever happens earlier. If this happens 3 times, the job is disabled and will not be sent anymore.
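The manual folder test described in this entry can also be scripted and run on the affected client under the same user account as the render service. The following is a minimal sketch, not a Royal Render tool. The function name `check_output_dir` and the example path are hypothetical.

```python
import os
import tempfile

def check_output_dir(path):
    """Return "ok", "not found" or "not writable" for the given directory."""
    if not os.path.isdir(path):
        return "not found"
    try:
        # Actually creating a temporary file tests real write permission;
        # the file is removed again when the block is left.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError:
        return "not writable"
    return "ok"

# Hypothetical network path; replace it with the output folder of your job.
print(check_output_dir(r"\\fileserver\projects\shot010\images"))
```

If the result differs between your workstation login and the service user, the share permissions are the likely cause.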

Job send 3 times more than frames exist


No frames found


The server has sent the job to clients multiple times.

But still nothing, or not all frames, have been rendered, or the server does not find the frames (if files were rendered).


Reasons for this issue:


  • The job was sent with a wrong output path that does not match the output path in the scene.
    You find new frames, but the server does not find them.
  • There is an issue with the scene/renderer. The renderer is started but does not output any frames. Perhaps some frames have been rendered, but others simply won't render.
  • If you find new frames but the server does not, compare the image path+name+extension of the frames you find with the output name of the job. Usually jobs are sent with the help of Submitter Plugins or Scene Parsers. These should automatically set the output path in the rrSubmitter for you. If you had to write the output path manually, you need to set it right next time, or use a Submitter Plugin if available. (Check the help section Renderer/YourRenderer.)
  • Please check the last render log files of the job.
    Is there any information that the renderer has written a frame?
    Is there any information that the renderer completed its task?
  • If it is not possible to set the right output path, you can enable "Do not check for frames" (not recommended).
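The comparison of path, name and extension can be sketched in a few lines. This is only an illustration under stated assumptions: the job output name is given without a frame number, the renderer appends the frame number after the name, and Windows-style paths are used. The function name and all example paths are hypothetical.

```python
import ntpath  # Windows path rules, usable on any OS

def compare_output_names(job_output, found_file):
    """Return which of "path", "name", "extension" differ between the two."""
    job_dir, job_file = ntpath.split(job_output)
    found_dir, found_file_name = ntpath.split(found_file)
    job_name, job_ext = ntpath.splitext(job_file)
    found_name, found_ext = ntpath.splitext(found_file_name)

    differences = []
    if (ntpath.normcase(ntpath.normpath(job_dir))
            != ntpath.normcase(ntpath.normpath(found_dir))):
        differences.append("path")
    # Assumption: the frame number follows the name, e.g. beauty.0001
    if not found_name.startswith(job_name):
        differences.append("name")
    if job_ext.lower() != found_ext.lower():
        differences.append("extension")
    return differences

print(compare_output_names(
    r"\\server\proj\images\beauty.exr",
    r"\\server\proj\images\masterLayer\beauty.0001.exr",
))  # prints ['path']
```

In this example the renderer wrote into an extra subfolder, so the server looks in the wrong directory.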

A frame was send more than 25 times. Frame removed from job.

The rrServer has sent a frame to clients 25 times.

And the frame is still not rendered.

Therefore the rrServer skips this frame until you reset the job.

Please check the render log file list of the job (see next section).


If it is on purpose that the job does not render frames, you can enable the job option "Ignore Crashes".


Render application not found

A client got a job, but the client does not know where to find the render application executable.

Please see help section Troubleshooting/Submission


 



II. Render Log files



  1. Open the Job Log page in rrControl.


  2. You have a list of all render log files on the left side.
    They are shown as
    [Date Time]  [first frame] [client name] [relative size]
    The log files can have colors:
    Black     Default
    Green     The client is still rendering
    Orange    The job was aborted at the client
    Red       The render application crashed, the client found an error message in the log, or there was a license problem
    Violet    The client executed an RR Pre/Post/Done script




Structure of the render log file



A

++++++++++++++++++++++++ Log Start +++++++++++++++

DAYWALKER033  rrClient console L     8.0.beta2b    rrVer8.0.alpha17

08.22 03:18.09

...

...

...

This part shows some information about

the client

the render range

the executable used to render

the scene file that was rendered

B

++++++++++++ Environment Variables Job: +++++++++++++++++

...

...

...

++++++++++++ Environment Variables Client: +++++++++++++++

...

...

...

++++++++++++ Environment Variables Done  +++++++++++++++++++

Some environment variables with all kinds of information about the job and the client.

C

call "\RRender6\render_apps\_setenv\win\Softimage.bat"

...

...

...


Now the commands of the render config files are executed.

Usually the first command is to call the global and render app specific setenv file.

This setenv file sets the paths to plugins and license servers, can define options, ...

D

++++++++++++++++++++++ START +++++++++++++++++++++++

C:\Program Files\Autodesk\Maya\bin\Maya scene start end params

...

15:33.14 rrMaya - SET: defaultRenderGlobals.imageFilePrefix = MyImageOutput

15:33.14 rrMaya - SET: defaultRenderGlobals.animation = True

15:33.14 rrMaya - SET: defaultRenderGlobals.modifyExtension = 0

...

...

...

15:33.15 rrMaya      : Scene load time: 0:00:07.125000  h:m:s.ms

15:33.15 rrMaya      : Scene init done, starting to render... 

The render application is started with all required commandline parameters.

Some render scripts will print what they change in the scene file before it starts to render.


E

15:33.23 rrMaya      : Starting to render frame #1 ...

15:33.23 rrKSO - ---: C:/RR_localdata/renderout/A/3D/Maya/images/MUM_ANIM_MASTER1.1.exr

...

SCEN 0.2     22 MB info : wallclock  0:00:00.00 for scene preprocessing

RC   0.2     23 MB info : option: scanline          off

RC   0.2     23 MB info : option: trace             on

...

RC   0.2     36 MB info : wallclock  0:00:02.49 for rendering

...

15:33.27 rrMaya      : Frame #1 done. Frame Time: 0:00:04.569000  h:m:s.ms

15:33.27 rrMaya      : rrKSO Frame(s) done #1

...

15:37.29 rrMaya      : Render done

Some RR render scripts report which frame they start to render. 

Then the renderer outputs some information.

And at last the RR script reports the time a frame took to process.

F

++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++++++++++++++ Render Executable done ++++++++++++++++

++++++++++++++ Royal Render checks the return code of the executable +++++

++++++++++++++++++++++ 05.26 14:13.53 +++++++++++++++++++++++

++++ Executable returned -1073741819 (0x ffffffffc0000005) as exit code for frame 65.

++++ Render crashed

+++++++++++++++++++++++++++++++++++++++++++++++++


After the render application finished, the tool rrCheckexitcode checks for the returned error code of the application.

If an application crashed, then it returns an error code. An application should never crash, it should always print an error message and exit. But as all applications have bugs, a crash can happen.

Most of the time it is not possible to convert these error codes into actual error messages. You can try to take a look at Windows/Linux system error codes, but they probably won't help you.


Note: It happens that the application rendered the frames successfully, but crashed afterward. For example while unloading a plugin or releasing memory.
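On Windows, the negative number in the log is the signed form of a 32-bit status code. The sketch below converts it to the usual hexadecimal notation and looks it up in a small table of well-known Windows NTSTATUS values. The table is illustrative and far from complete, the function name is hypothetical, and many exit codes will still be reported as unknown.

```python
# A few well-known Windows NTSTATUS values (illustrative, not complete).
KNOWN_WINDOWS_CODES = {
    0xC0000005: "access violation",
    0xC00000FD: "stack overflow",
    0xC000013A: "terminated by Ctrl+C or console close",
    0xC0000409: "stack buffer overrun",
}

def describe_exit_code(code):
    """Convert a signed exit code to unsigned 32-bit hex plus a description."""
    unsigned = code & 0xFFFFFFFF
    name = KNOWN_WINDOWS_CODES.get(unsigned, "unknown")
    return "0x{:08X} ({})".format(unsigned, name)

print(describe_exit_code(-1073741819))  # prints 0xC0000005 (access violation)
```

The description tells you what kind of crash occurred, but usually not which plugin or scene element caused it.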

G

Current time:08.17 15:37.37


Total processing time: 00:01.34  (this KSO segment)

Max memory usage: 0.85GB  (this KSO segment)

Max memory usage: 0.85GB  (all KSO segments) 

Max CPU usage: 2.3cores  29% (of 8 cores) (last KSO segment)

Max CPU usage: 2.5cores  32% (of 8 cores)

After all processes have closed, the rrClient adds the current time and some other information to the log file.


H

************************* LOCAL RENDER INFORMATION *************************

Total render files found:     7 (all KSO segments)

Total render files found:     4 (this KSO segment)

Files successful copied:      7

Files broken/not copied:      0

Total render files found (main pass only):     4

List of files: (C=copied    R=successfully read    X=Exr cropped    F=image verify failed    V=rrViewer cache created)

    6 C R V: \Maya\images\MUM_ANIM_MASTER1.6.exr

    7 C R V: \Maya\images\MUM_ANIM_MASTER1.7.exr

    8 C R V: \Maya\images\MUM_ANIM_MASTER1.8.exr

    9 C R V: \Maya\images\MUM_ANIM_MASTER1.9.exr

If you have enabled local render out, you can see which frames were rendered during this frame segment, and whether the frames are valid (R=successfully read).

You can also see whether each frame was copied (C) to the fileserver. If it was not copied, it was usually a placeholder file that was created before the render started.
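For long logs, the file list can be checked with a script. This is a minimal sketch that assumes the line layout of the sample above: a number, the flag letters, a colon, then the file name. Treating the leading number as the frame number is an assumption based on the sample file names, and both function names are hypothetical.

```python
import re

# number, optional flag letters, colon, file name (layout taken from the sample)
FILE_LINE = re.compile(r"^\s*(\d+)\s+([CRXFV ]*?)\s*:\s*(\S.*)$")

def parse_file_list(log_text):
    """Return a list of (number, set_of_flags, file_name) tuples."""
    entries = []
    for line in log_text.splitlines():
        m = FILE_LINE.match(line)
        if m:
            entries.append((int(m.group(1)), set(m.group(2).split()), m.group(3)))
    return entries

def problem_frames(entries):
    """Numbers of files that were not copied, not read, or failed verification."""
    return [number for number, flags, _ in entries
            if "C" not in flags or "R" not in flags or "F" in flags]

sample = r"""
    6 C R V: \Maya\images\MUM_ANIM_MASTER1.6.exr
    7 C R V: \Maya\images\MUM_ANIM_MASTER1.7.exr
"""
print(problem_frames(parse_file_list(sample)))  # prints []
```

A reported frame is only a candidate for inspection, since an uncopied file may simply be a placeholder.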

I

*********** CLIENT REPORTS DURING THIS RENDER **************

If the client has collected some information during render, then it will append the information at the end of the log file.

It collects for example error messages that it has found in the log output.

E - I


If you have enabled KSO (Keep Scene Open), the client might get a new segment to render. And it will output log sections E to I again for each segment.


       

Find issues












III. Scene Breakdown


This part is a bit more work, but sometimes you have a scene that simply does not want to render. Or you get noise, or ...


You can change thousands of settings in a full scene with hundreds of objects, shaders, lights, textures and plugins.

I see it all the time: artists try to find an issue inside a production scene by changing one value, rendering, changing another, rendering, ...

But do you have time to change everything? Often the reason is one single object. So how long will it take until you find this needle in a haystack?


So the idea of the scene breakdown is to reduce the possibilities and exclude possible reasons very fast.

The workflow is to change something, save the scene with a meaningful name that states what you have changed, and submit it to the farm.

While the scene is on the farm, you can already prepare the next test scene and submit it as well.

You could end up with 10 scenes rendering on the farm at the same time to see which one works.


The following list is a collection of steps you can try, depending on the type of render issue you get.

Make this test render fast!

Set the Antialiasing to a low setting (MRay: -2 -2) or reduce the image resolution.

Reduce area light samples.

Reduce the time range.

Make a small test scene first.

Try to delete all geometry and just place a cube in the scene. If it renders, you know it is something in the scene.

If not, try a brand new scene with a cube only.

Make the test render fast / Is it one of these render features?

Disable displacement in your render settings.

Disable shadows.

Disable volume rendering.

Remove all textures or replace all texture connections with a single 256x256 pixel image.

Is it a shader?

Remove all shaders and textures from the scene. Apply a lambert only.

Is it the geometry?

Is it a light?

Is the scene itself corrupt?/Is it a render setting?

Remove all Objects and place an animated sphere/cube into the scene.

Same for the lights, remove all lights and create one directional light only.

With only one light and one object in the scene you have not much left to test. 

Perhaps the scene is corrupt? Try to create a new scene and merge your old into that one.

Is it one of my objects?

Which one?

Remove half of your objects. Save it as "_halfA". Open the original scene, remove the other half and save it as "_halfB".

If one of these scenes renders, the problem is probably in the objects you removed from it, which are the objects left in the other scene.

Repeat the half-splitting ("_halfAA"+"_halfAB", "_halfABA"+"_halfABB"...) until you have the object.

Note: it could be the geometry, the shader or a texture applied to this object.
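The half-splitting is a binary search. The following sketch shows the logic only. `renders_ok` is a hypothetical callback that stands for "save a scene with just these objects, submit it and check the result", and the sketch assumes that exactly one object causes the problem.

```python
def find_bad_object(objects, renders_ok):
    """Halve a non-empty object list until one offending object remains.

    renders_ok(subset) must return True if a scene that contains only
    `subset` renders correctly (hypothetical callback, e.g. a farm test).
    """
    candidates = list(objects)
    while len(candidates) > 1:
        half = len(candidates) // 2
        half_a, half_b = candidates[:half], candidates[half:]
        # If half A renders fine, the culprit must be in half B, and vice versa.
        candidates = half_b if renders_ok(half_a) else half_a
    return candidates[0]

# Simulated example: "obj7" is the broken object.
scene_objects = ["obj%d" % i for i in range(10)]
print(find_bad_object(scene_objects, lambda subset: "obj7" not in subset))  # prints obj7
```

With this approach a scene with 1000 objects needs about 10 rounds instead of up to 1000 single tests. Submitting both halves at once, as described above, additionally confirms the assumption that only one object is responsible.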

You know it is one of the shader networks.

So test single features of the shading network.

Create multiple scenes with only single parts of the shading network. Again, create multiple test scenes you can send to the farm at the same time.

You have two ways.

Either remove parts of the shading network one by one and send them to render:

no displacement

+no bump mapping

+no reflection

...

OR

exclusively test single parts of the shading network:

- Main shader only (no textures)

- Main shader + Displacement

- Main shader + Diffuse texture

- Main shader + Reflection

- Main shader + Ambient Occlusion

- Main shader + Bump mapping

...


       

IV. Time Debugging

You have a scene that renders really slowly, and you think it should not take that long (e.g. local render 3min, farm render 20min).


These are some known issues and tips for debugging:

External files - network traffic

It often happens that your fileserver is able to handle artists working on a scene with many file inputs.

But once the scene is sent to 10 (30?) clients at a time, the network slows down.

Try to send the scene to 1 client only. Disable (most of) the other clients so that they do not get any job.

Simulation cache files - read only

If you use simulations in your scene, please check that the simulation is not re-simulated at render time.

Some simulators offer an option to read cache files only, without re-simulation.

If you are not sure whether a setting has this effect, close the scene, rename the cache files folder and re-open the scene.

If the scene does not re-simulate, you will not see anything of your simulation.

If the render time should be about the same for all frames, but according to the logs the total frame time increases with the frame number sent to a client, this indicates that the scene is re-simulated.
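The frame-time check can be automated for render logs that contain "Frame #N done. Frame Time: h:m:s.ms" lines, as in the rrMaya sample log in section II. This is a rough sketch with hypothetical function names. The test for "steadily increasing" is a simple heuristic and not a proof of re-simulation.

```python
import re

FRAME_DONE = re.compile(
    r"Frame #(\d+) done\. Frame Time:\s*(\d+):(\d+):(\d+(?:\.\d+)?)"
)

def frame_times(log_text):
    """Return a list of (frame_number, seconds) in the order they appear."""
    result = []
    for m in FRAME_DONE.finditer(log_text):
        seconds = int(m.group(2)) * 3600 + int(m.group(3)) * 60 + float(m.group(4))
        result.append((int(m.group(1)), seconds))
    return result

def looks_resimulated(times, min_frames=3):
    """Heuristic: True if every frame took longer than the frame before it."""
    seconds = [s for _, s in sorted(times)]
    if len(seconds) < min_frames:
        return False
    return all(later > earlier for earlier, later in zip(seconds, seconds[1:]))
```

Compare logs of segments that start at different frame numbers: if a segment starting at frame 100 has a much longer first frame than a segment starting at frame 1, re-simulation is a likely explanation.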

Simulation cache files - relative paths

If you use relative paths to your cache files (e.g. inside the Maya project folder), then you have to disable "local scene cache" for the job.

Reduce/Remove everything that takes render time

If you remove everything besides the issue, you reduce the chance of making a false assumption about it.

Reduce render resolution and antialiasing.

Break down/slim the scene. See "Scene Breakdown" for how to isolate the object responsible for the render time.

Render log time/CPU information

RR and the renderer print a few helpful lines in the render log.

For example, a default RR + Maya + V-Ray render:

++ Log Start +++

10.01 18:15.57

[2013/Oct/1|18:16:03] V-Ray: V-Ray core version is 2.00.01

File read in 4 seconds.

Result: C:/RR_localdata/cachedscenes/0257_0970_lighting_v01_004.mb

V-Ray: Starting render

[2013/Oct/1|18:16:35]

[2013/Oct/1|19:16:54] V-Ray: Rendering.

[2013/Oct/1|19:16:54] V-Ray: Pre-render export.

[2013/Oct/1|19:16:54] V-Ray: Pre-render export

[2013/Oct/1|20:12:35] V-Ray: Pre-render export done.

[2013/Oct/1|20:12:35] V-Ray: Rendering frames.

+++ Render Executable done ++

++ 10.01. 20:14.34 +++

Max core usage of job: 24.00cores (of 24 cores)

Startup to first frame: 59.28 m.s (render app start + scene load + 1st frame preprocessing)


Analysis:

Max CPU usage of the job is 24 of 24, so there is no thread limitation.

Startup to first frame is 59min. So it took a long time until the frame was started. If I compare it to the VRay pre-render export times, then I can see that there is an hour difference. Which confirms the long pre-processing time.
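Finding the longest pause between timestamped lines can be scripted. The sketch below assumes the V-Ray timestamp format of the sample log ("[2013/Oct/1|18:16:35]") with English month abbreviations. The function name `largest_gap` is hypothetical.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}
STAMP = re.compile(
    r"\[(\d{4})/([A-Za-z]{3})/(\d{1,2})\|(\d{1,2}):(\d{2}):(\d{2})\]\s*(.*)"
)

def largest_gap(log_lines):
    """Return (seconds, message) of the line that follows the longest pause."""
    previous = None
    best = (0.0, "")
    for line in log_lines:
        m = STAMP.match(line.strip())
        if not m:
            continue  # line without a timestamp
        year, mon, day, hh, mm, ss, message = m.groups()
        stamp = datetime(int(year), MONTHS[mon], int(day),
                         int(hh), int(mm), int(ss))
        if previous is not None:
            gap = (stamp - previous).total_seconds()
            if gap > best[0]:
                best = (gap, message)
        previous = stamp
    return best
```

Applied to the timestamped lines of the sample log, the longest pause is 3619 seconds (about one hour) before "V-Ray: Rendering.", followed by 3341 seconds during the pre-render export.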

Add time prints

Add as many time prints as possible.

For example as pre/post render scripts.

If you use simulations, add a pre and post-frame script to all sim objects.

Add objects to your scene with an expression/script-driven setting (e.g. a position noise script). Then you know when the scene/frame is being prepared after scene load.
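If your render application supports Python for pre/post scripts or expressions, a time print can be as small as the sketch below. It is generic Python. The function name `time_print` and the "TIMEPRINT" marker are made up for illustration, and how to attach the call to a pre-frame or post-frame event depends on your application and is not shown.

```python
import time

def time_print(label):
    """Print a wall-clock timestamp with a label and return the printed line."""
    message = "{} TIMEPRINT: {}".format(time.strftime("%H:%M:%S"), label)
    # flush=True so the line appears in the render log without buffering delay
    print(message, flush=True)
    return message

time_print("pre-frame")
time_print("post-frame")
```

A fixed marker such as "TIMEPRINT" makes the lines easy to find in a long render log.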

Increase renderer verbosity

Increase the verbosity of the renderer output.

(Note: If you increase it too much, the logging itself slows down the rendering. E.g. one log line takes 0.01s. Not much, but if you print 1000 log lines it takes 10s.)

Local workstation/artist settings/caches

Sometimes a scene is fast for the artist because of local caches.

So it could behave differently on a different machine or even with a different user.

Please log in on a render farm machine with the same user that you use for rendering (see render log: ++ Log Start +++ Environment user is renderadmin).

Open the scene and render the first frame. (Should not take long as you have decreased the AA and the resolution).

Then render the last frame of the scene.

How to send debug jobs to the farm

As described in "External files - network traffic", you should not send one scene to many clients at the same time.

If you get a fast render with a few clients and a slow render with many clients (same frames), then your network has reached a limit. (Hint: Read the rrInstallation section about Jumbo Frames.)


You can set sequence divide min/max to:

  • 1/1: You get the full scene load and preprocessing time for every frame.
  • 3/3: You get the scene load once, but the frame preprocessing and render times multiple times.
    If you compare it to 1/1, you can calculate the scene load time.
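The calculation can be written down as a small worked example. It assumes a simple model that is not stated in the logs: a segment takes the scene load time plus the number of frames times a constant per-frame time (preprocessing and rendering). The function name and the example numbers are made up.

```python
def split_load_and_frame_time(segment_time_1, segment_time_3):
    """Estimate scene load time and per-frame time from two measurements.

    segment_time_1: total time of a 1-frame segment (sequence divide 1/1)
    segment_time_3: total time of a 3-frame segment (sequence divide 3/3)
    Assumed model: segment time = load_time + frames * frame_time
    """
    frame_time = (segment_time_3 - segment_time_1) / 2.0
    load_time = segment_time_1 - frame_time
    return load_time, frame_time

# Example: a 1-frame segment took 300 s, a 3-frame segment took 500 s.
print(split_load_and_frame_time(300.0, 500.0))  # prints (200.0, 100.0)
```

In this example two thirds of a single-frame segment are spent loading the scene, so larger segments would use the farm far more efficiently.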