Take a look at the fanout of the U9357 output. A large fanout puts a large load on that output, which may be what is causing the large delay. Do you have a max_fanout constraint set? If not, try adding one to limit the number of gates each output drives. Also make sure you have set a max_transition constraint. The input transition on this device looks a little slow too.
You should add the max_fanout and max_transition constraints to both the synthesis constraints file and the Encounter constraints file. The best values for each constraint depend on the capabilities of the technology and the requirements of the design.
To determine a good max_transition, open the standard cell library .lib file and look at the transition timing tables for the clock inputs of flops and the inputs of buffers and inverters. Transition values near the middle of these tables should give you a good idea of the typical transitions you can expect from the technology.
For max_fanout, pick a reasonable number that still produces good transition times, meaning transitions that stay within the max_transition limit you chose.
On the Encounter side, max_fanout and max_transition should be defined in the Encounter constraints file, which should be a standard .sdc file.
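As a rough sketch, the two constraints could look like this in the .sdc file. The numbers are placeholders I made up for illustration, not recommendations; take the transition limit from your own .lib tables (in the library's time unit) and tune the fanout limit for your design. Applying them to the whole design with current_design is also just one option.

```tcl
# Placeholder values only: replace with numbers derived from your library and design.

# Limit the transition time on every net in the design.
# 0.5 is an assumed value, expressed in the library's time unit.
set_max_transition 0.5 [current_design]

# Limit the fanout load each output may drive.
# 16 is an assumed value.
set_max_fanout 16 [current_design]
```

The same two lines can usually go in the synthesis constraints as well, so that both tools work toward the same limits.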