commit 8cc7a88e3430e3dbc0a5c873c467ee9fed729cfd
Author: Akash Goel <akash.goel@intel.com>
Date:   Mon Dec 28 13:56:18 2015 +0530

    drm/i915: Support to enable TRTT on GEN9

    Gen9 has additional address translation hardware in the form of the
    Tiled Resource Translation Table (TR-TT), which provides an extra level
    of abstraction over PPGTT.
    This is useful for mapping Sparse/Tiled texture resources.
    Sparse resources are created as virtual-only allocations. Regions of the
    resource that the application intends to use are bound to physical memory
    on the fly and can be re-bound to different memory allocations over the
    lifetime of the resource.

    TR-TT is tightly coupled with PPGTT: a new instance of TR-TT is required
    for each new PPGTT instance, but TR-TT need not be enabled for every
    context. 1/16th of the 48-bit PPGTT space is earmarked for translation
    by TR-TT; which chunk to use is conveyed to HW through a register.
    Any GFX address that lies in that reserved 44-bit range is translated
    through TR-TT first and then through PPGTT to get the actual physical
    address, so the output of a TR-TT translation is a PPGTT offset.

    TR-TT is constructed as a 3-level tile table. Each tile is 64KB in size,
    which leaves 44 - 16 = 28 address bits. The 28 bits are partitioned as
    9+9+10, and each level is contained within a 4KB page, hence L3 and L2
    are composed of 512 64-bit entries and L1 is composed of 1024 32-bit
    entries.

    There is a provision to keep the TR-TT tables in virtual space, where
    the pages of the TR-TT tables are mapped to PPGTT.
    Currently this is the only supported mode; in this mode UMD has full
    control of TR-TT management, with bare minimum support from KMD.
    So the entries of the L3 table will contain the PPGTT offset of L2
    table pages, and similarly the entries of the L2 table will contain
    the PPGTT offset of L1 table pages. The entries of the L1 table will
    contain the PPGTT offset of the BOs actually backing the Sparse
    resources.
    UMD will have to allocate the L3/L2/L1 table pages as regular BOs and
    assign them a PPGTT address through the Soft Pin API (for example, use
    soft pin to assign l3_table_address to the L3 table BO, when used).
    UMD will also program the entries in the TR-TT page tables using
    regular batch commands (MI_STORE_DATA_IMM), or via mmapping of the
    page table BOs.
    UMD may do the complete PPGTT address space management, on the premise
    that this could help minimize conflicts.

    Any space in the TR-TT segment not bound to a Sparse texture will be
    handled through the Invalid tile: the user is expected to initialize
    the entries of a new L3/L2/L1 table page with the Invalid tile
    pattern. The entries corresponding to holes in a Sparse texture
    resource will be set with the Null tile pattern.
    Improper programming of TR-TT should only lead to a recoverable GPU
    hang, eventually leading to banning of the culprit context without
    victimizing others.

    The association of a Sparse resource with its BOs is known only to
    UMD, and only Sparse resources shall be assigned an offset from the
    TR-TT segment by UMD. The use of the TR-TT segment and the mapping of
    Sparse resources are transparent to KMD; UMD does the address
    assignment from the TR-TT segment autonomously and KMD remains
    oblivious of it.
    Regular objects must not be assigned an address from the TR-TT
    segment; they will be mapped to PPGTT in the usual way by KMD.

    This patch provides an interface through which UMD can ask KMD to
    enable TR-TT for a given context. A new I915_CONTEXT_PARAM_TRTT param
    has been added to the I915_GEM_CONTEXT_SETPARAM ioctl for that
    purpose. UMD will have to pass the GFX address of the L3 table page
    and the start location of the TR-TT segment, along with the pattern
    values for the Null & Invalid tile registers.

    v2:
     - Support context_getparam for TRTT also and dispense with a separate
       GETPARAM case for TRTT (Chris).
     - Use i915_dbg to log errors for the invalid TRTT ABI parameters passed
       from user space (Chris).
     - Move all the argument checking for TRTT in context_setparam to the
       set_trtt function (Chris).
     - Change the type of 'flags' field inside 'intel_context' to unsigned (Chris).
     - Rename certain functions to rightly reflect their purpose, rename
       the new param for TRTT in gem_context_param to I915_CONTEXT_PARAM_TRTT,
       rephrase a few lines in the commit message body, add more comments (Chris).
     - Extend the ABI to allow the user to specify the TRTT segment location also.
     - Fix selective enabling of TRTT on a per-context basis; explicitly
       disable TR-TT at the start of a new context.

    v3:
     - Check the return value of gen9_emit_trtt_regs (Chris).
     - Update the kernel doc for the intel_context structure.
     - Rebased.

    v4:
     - Fix the warnings reported by 'checkpatch.pl --strict' (Michel).
     - Fix the context_getparam implementation to avoid resetting the size
       field, which affected the TRTT case.

    v5:
     - Update the TR-TT params right away in context_setparam, by constructing
       & submitting a request emitting LRIs, instead of deferring it and
       conflating it with the next batch submission (Chris).
     - Follow the prescribed struct_mutex handling rules while accessing the
       user space buffer, in both the context_setparam & getparam functions
       (Chris).

    v6:
     - Fix the warning caused by removal of an unallocated trtt vma node.

    v7:
     - Move context ref/unref to context_setparam_ioctl from set_trtt() and
       remove it from get_trtt() as it is not really needed there (Chris).
     - Add a check for improper values for the Null & Invalid tiles.
     - Remove superfluous DRM_ERROR from trtt_context_allocate_vma (Chris).
     - Rebased.

    v8:
     - Add context ref/unref to context_getparam_ioctl also, so as to be
       consistent and to ease future extension of the ioctl (Chris).

    v9:
     - Fix the handling of the return value from trtt_context_allocate_vma(),
       which caused a kernel panic at context-destroy time when the trtt vma
       allocation had failed.
     - Rebased.

    v10:
     - Rebased.

    v11:
     - Rebased (intel_ring_emit gone).

    v12:
     - Rebased (i915_add_request_no_flush gone).

    Testcase: igt/gem_trtt

    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Michel Thierry <michel.thierry@intel.com>
    Signed-off-by: Akash Goel <akash.goel@intel.com>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Michel Thierry <michel.thierry@intel.com>

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 37c3522e9389..b15e5d9ad243 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -775,6 +775,7 @@ struct intel_csr {
    func(has_resource_streamer); \
    func(has_runtime_pm); \
    func(has_snoop); \
+   func(has_trtt); \
    func(unfenced_needs_alignment); \
    func(cursor_needs_physical); \
    func(hws_needs_physical); \
@@ -2986,6 +2987,8 @@ intel_info(const struct drm_i915_private *dev_priv)
 
 #define HAS_POOLED_EU(dev_priv)    ((dev_priv)->info.has_pooled_eu)
 
+#define HAS_TRTT(dev)      (INTEL_INFO(dev)->has_trtt)
+
 #define INTEL_PCH_DEVICE_ID_MASK       0xff00
 #define INTEL_PCH_DEVICE_ID_MASK_EXT       0xff80
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE       0x3b00
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a56e79430082..3afaeaf19b95 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -92,6 +92,14 @@
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
+static void intel_context_free_trtt(struct i915_gem_context *ctx)
+{
+   if (!ctx->trtt_info.vma)
+       return;
+
+   intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+}
+
 void i915_gem_context_free(struct kref *ctx_ref)
 {
    struct i915_gem_context *ctx = container_of(ctx_ref, typeof(*ctx), ref);
@@ -101,6 +109,7 @@ void i915_gem_context_free(struct kref *ctx_ref)
    trace_i915_context_free(ctx);
    GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 
+   intel_context_free_trtt(ctx);
    i915_ppgtt_put(ctx->ppgtt);
 
    for (i = 0; i < I915_NUM_ENGINES; i++) {
@@ -564,6 +573,129 @@ void i915_gem_context_close(struct drm_device *dev, struct drm_file *file)
    idr_destroy(&file_priv->context_idr);
 }
 
+static int
+intel_context_get_trtt(struct i915_gem_context *ctx,
+              struct drm_i915_gem_context_param *args)
+{
+   struct drm_i915_gem_context_trtt_param trtt_params;
+   struct drm_i915_private *dev_priv = ctx->i915;
+
+   if (!HAS_TRTT(dev_priv) || !USES_FULL_48BIT_PPGTT(dev_priv)) {
+       return -ENODEV;
+   } else if (args->size < sizeof(trtt_params)) {
+       args->size = sizeof(trtt_params);
+   } else {
+       trtt_params.segment_base_addr =
+           ctx->trtt_info.segment_base_addr;
+       trtt_params.l3_table_address =
+           ctx->trtt_info.l3_table_address;
+       trtt_params.null_tile_val =
+           ctx->trtt_info.null_tile_val;
+       trtt_params.invd_tile_val =
+           ctx->trtt_info.invd_tile_val;
+
+       mutex_unlock(&dev_priv->drm.struct_mutex);
+
+       if (__copy_to_user(u64_to_user_ptr(args->value),
+                  &trtt_params,
+                  sizeof(trtt_params))) {
+           mutex_lock(&dev_priv->drm.struct_mutex);
+           return -EFAULT;
+       }
+
+       args->size = sizeof(trtt_params);
+       mutex_lock(&dev_priv->drm.struct_mutex);
+   }
+
+   return 0;
+}
+
+static int
+intel_context_set_trtt(struct i915_gem_context *ctx,
+              struct drm_i915_gem_context_param *args)
+{
+   struct drm_i915_gem_context_trtt_param trtt_params;
+   struct i915_vma *vma;
+   struct drm_i915_private *dev_priv = ctx->i915;
+   int ret;
+
+   if (!HAS_TRTT(dev_priv) || !USES_FULL_48BIT_PPGTT(dev_priv))
+       return -ENODEV;
+   else if (i915_gem_context_use_trtt(ctx))
+       return -EEXIST;
+   else if (args->size < sizeof(trtt_params))
+       return -EINVAL;
+
+   mutex_unlock(&dev_priv->drm.struct_mutex);
+
+   if (copy_from_user(&trtt_params,
+              u64_to_user_ptr(args->value),
+              sizeof(trtt_params))) {
+       mutex_lock(&dev_priv->drm.struct_mutex);
+       ret = -EFAULT;
+       goto exit;
+   }
+
+   mutex_lock(&dev_priv->drm.struct_mutex);
+
+   /* Check if the setup happened from another path */
+   if (i915_gem_context_use_trtt(ctx)) {
+       ret = -EEXIST;
+       goto exit;
+   }
+
+   /* basic sanity checks for the segment location & l3 table pointer */
+   if (trtt_params.segment_base_addr & (GEN9_TRTT_SEGMENT_SIZE - 1)) {
+       DRM_DEBUG_DRIVER("segment base address not correctly aligned\n");
+       ret = -EINVAL;
+       goto exit;
+   }
+
+   if (((trtt_params.l3_table_address + PAGE_SIZE) >=
+         trtt_params.segment_base_addr) &&
+        (trtt_params.l3_table_address <
+         (trtt_params.segment_base_addr + GEN9_TRTT_SEGMENT_SIZE))) {
+       DRM_DEBUG_DRIVER("l3 table address conflicts with trtt segment\n");
+       ret = -EINVAL;
+       goto exit;
+   }
+
+   if (trtt_params.l3_table_address & ~GEN9_TRTT_L3_GFXADDR_MASK) {
+       DRM_DEBUG_DRIVER("invalid l3 table address\n");
+       ret = -EINVAL;
+       goto exit;
+   }
+
+   if (trtt_params.null_tile_val == trtt_params.invd_tile_val) {
+       DRM_DEBUG_DRIVER("incorrect values for null & invalid tiles\n");
+       return -EINVAL;
+   }
+
+   vma = intel_trtt_context_allocate_vma(&ctx->ppgtt->base,
+                         trtt_params.segment_base_addr);
+   if (IS_ERR(vma)) {
+       ret = PTR_ERR(vma);
+       goto exit;
+   }
+
+   ctx->trtt_info.vma = vma;
+   ctx->trtt_info.null_tile_val = trtt_params.null_tile_val;
+   ctx->trtt_info.invd_tile_val = trtt_params.invd_tile_val;
+   ctx->trtt_info.l3_table_address = trtt_params.l3_table_address;
+   ctx->trtt_info.segment_base_addr = trtt_params.segment_base_addr;
+
+   ret = intel_lr_rcs_context_setup_trtt(ctx);
+   if (ret) {
+       intel_trtt_context_destroy_vma(ctx->trtt_info.vma);
+       goto exit;
+   }
+
+   i915_gem_context_set_trtt(ctx);
+
+exit:
+   return ret;
+}
+
 static inline int
 mi_set_context(struct drm_i915_gem_request *req, u32 flags)
 {
@@ -1024,7 +1156,14 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
        return PTR_ERR(ctx);
    }
 
-   args->size = 0;
+   /*
+    * Take a reference also, as in certain cases we have to release &
+    * reacquire the struct_mutex and we don't want the context to
+    * go away.
+    */
+   i915_gem_context_get(ctx);
+
+   args->size = (args->param != I915_CONTEXT_PARAM_TRTT) ? 0 : args->size;
    switch (args->param) {
    case I915_CONTEXT_PARAM_BAN_PERIOD:
        ret = -EINVAL;
@@ -1049,10 +1188,14 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
    case I915_CONTEXT_PARAM_WATCHDOG:
        ret = i915_gem_context_get_watchdog(ctx, args);
        break;
+   case I915_CONTEXT_PARAM_TRTT:
+       ret = intel_context_get_trtt(ctx, args);
+       break;
    default:
        ret = -EINVAL;
        break;
    }
+   i915_gem_context_put(ctx);
    mutex_unlock(&dev->struct_mutex);
 
    return ret;
@@ -1076,6 +1219,13 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
        return PTR_ERR(ctx);
    }
 
+   /*
+    * Take a reference also, as in certain cases we have to release &
+    * reacquire the struct_mutex and we don't want the context to
+    * go away.
+    */
+   i915_gem_context_get(ctx);
+
    switch (args->param) {
    case I915_CONTEXT_PARAM_BAN_PERIOD:
        ret = -EINVAL;
@@ -1109,10 +1259,14 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
    case I915_CONTEXT_PARAM_WATCHDOG:
        ret = i915_gem_context_set_watchdog(ctx, args);
        break;
+   case I915_CONTEXT_PARAM_TRTT:
+       ret = intel_context_set_trtt(ctx, args);
+       break;
    default:
        ret = -EINVAL;
        break;
    }
+   i915_gem_context_put(ctx);
    mutex_unlock(&dev->struct_mutex);
 
    return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 88700bdbb4e1..737660efce5c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -108,6 +108,7 @@ struct i915_gem_context {
 #define CONTEXT_BANNABLE       3
 #define CONTEXT_BANNED         4
 #define CONTEXT_FORCE_SINGLE_SUBMISSION    5
+#define CONTEXT_USE_TRTT       6
 
    /**
     * @hw_id: - unique identifier for the context
@@ -157,6 +158,18 @@ struct i915_gem_context {
        bool initialised;
    } engine[I915_NUM_ENGINES];
 
+   /**
+    * @trtt_info: Programming parameters for tr-tt (redirection tables
+    * for userspace, for sparse resource management).
+    */
+   struct intel_context_trtt {
+       u32 invd_tile_val;
+       u32 null_tile_val;
+       u64 l3_table_address;
+       u64 segment_base_addr;
+       struct i915_vma *vma;
+   } trtt_info;
+
    /** ring_size: size for allocating the per-engine ring buffer */
    u32 ring_size;
    /** desc_template: invariant fields for the HW context descriptor */
@@ -240,6 +253,16 @@ static inline void i915_gem_context_set_force_single_submission(struct i915_gem_
    __set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ctx->flags);
 }
 
+static inline bool i915_gem_context_use_trtt(const struct i915_gem_context *ctx)
+{
+   return test_bit(CONTEXT_USE_TRTT, &ctx->flags);
+}
+
+static inline void i915_gem_context_set_trtt(struct i915_gem_context *ctx)
+{
+   __set_bit(CONTEXT_USE_TRTT, &ctx->flags);
+}
+
 static inline bool i915_gem_context_is_default(const struct i915_gem_context *c)
 {
    return c->user_handle == DEFAULT_CONTEXT_HANDLE;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4ff854e6413c..cfbc1c47aab8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1899,6 +1899,16 @@ int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv)
 {
    gtt_write_workarounds(dev_priv);
 
+   if (HAS_TRTT(dev_priv) && USES_FULL_48BIT_PPGTT(dev_priv)) {
+       /*
+        * Globally enable TR-TT support in Hw.
+        * Still TR-TT enabling on per context basis is required.
+        * Non-trtt contexts are not affected by this setting.
+        */
+       I915_WRITE(GEN9_TR_CHICKEN_BIT_VECTOR,
+              GEN9_TRTT_BYPASS_DISABLE);
+   }
+
    /* In the case of execlists, PPGTT is enabled by the context descriptor
     * and the PDPs are contained within the context itself.  We don't
     * need to do anything here. */
@@ -3171,6 +3181,56 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
    i915_ggtt_invalidate(dev_priv);
 }
 
+void intel_trtt_context_destroy_vma(struct i915_vma *vma)
+{
+   WARN_ON(!list_empty(&vma->obj_link));
+   WARN_ON(!list_empty(&vma->vm_link));
+   WARN_ON(!list_empty(&vma->exec_list));
+
+   WARN_ON(!i915_vma_is_pinned(vma));
+
+   if (drm_mm_node_allocated(&vma->node))
+       drm_mm_remove_node(&vma->node);
+
+   i915_ppgtt_put(i915_vm_to_ppgtt(vma->vm));
+   kmem_cache_free(vma->vm->i915->vmas, vma);
+}
+
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+               uint64_t segment_base_addr)
+{
+   struct i915_vma *vma;
+   int ret;
+
+   GEM_BUG_ON(vm->closed);
+
+   vma = kmem_cache_zalloc(vm->i915->vmas, GFP_KERNEL);
+   if (!vma)
+       return ERR_PTR(-ENOMEM);
+
+   INIT_LIST_HEAD(&vma->obj_link);
+   INIT_LIST_HEAD(&vma->vm_link);
+   INIT_LIST_HEAD(&vma->exec_list);
+   vma->vm = vm;
+   i915_ppgtt_get(i915_vm_to_ppgtt(vm));
+
+   /* Mark the vma as permanently pinned */
+   __i915_vma_pin(vma);
+
+   /* Reserve from the 48 bit PPGTT space */
+   vma->size = GEN9_TRTT_SEGMENT_SIZE;
+   ret = i915_gem_gtt_reserve(vma->vm, &vma->node,
+                  GEN9_TRTT_SEGMENT_SIZE, segment_base_addr,
+                  vma->node.color, 0);
+   if (ret) {
+       intel_trtt_context_destroy_vma(vma);
+       return ERR_PTR(ret);
+   }
+
+   return vma;
+}
+
 static struct scatterlist *
 rotate_pages(const dma_addr_t *in, unsigned int offset,
         unsigned int width, unsigned int height,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index da9aa9f706e7..08b1875e6376 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -143,6 +143,10 @@ typedef u64 gen8_ppgtt_pml4e_t;
 #define GEN8_PPAT_ELLC_OVERRIDE        (0<<2)
 #define GEN8_PPAT(i, x)            ((u64)(x) << ((i) * 8))
 
+/* Fixed size segment */
+#define GEN9_TRTT_SEG_SIZE_SHIFT   44
+#define GEN9_TRTT_SEGMENT_SIZE     (1ULL << GEN9_TRTT_SEG_SIZE_SHIFT)
+
 struct sg_table;
 
 struct intel_rotation_info {
@@ -600,4 +604,8 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 #define PIN_OFFSET_FIXED   BIT(11)
 #define PIN_OFFSET_MASK        (-I915_GTT_PAGE_SIZE)
 
+struct i915_vma *
+intel_trtt_context_allocate_vma(struct i915_address_space *vm,
+               uint64_t segment_base_addr);
+void intel_trtt_context_destroy_vma(struct i915_vma *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 537c1d56ecef..d51867c22b7a 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -355,6 +355,7 @@ static const struct intel_device_info intel_skylake_info = {
    .gen = 9,
    .has_csr = 1,
    .has_guc = 1,
+   .has_trtt = 1,
    .ddb_size = 896,
 };
 
@@ -364,6 +365,7 @@ static const struct intel_device_info intel_skylake_gt3_info = {
    .gen = 9,
    .has_csr = 1,
    .has_guc = 1,
+   .has_trtt = 1,
    .ddb_size = 896,
    .ring_mask = RENDER_RING | BSD_RING | BLT_RING | VEBOX_RING | BSD2_RING,
 };
@@ -399,6 +401,7 @@ static const struct intel_device_info intel_broxton_info = {
    GEN9_LP_FEATURES,
    .platform = INTEL_BROXTON,
    .ddb_size = 512,
+   .has_trtt = 1,
 };
 
 static const struct intel_device_info intel_geminilake_info = {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 1d35be1382b6..1085294f447f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -226,6 +226,25 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT   0
 #define   GEN8_RPCS_EU_MIN_MASK        (0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define GEN9_TR_CHICKEN_BIT_VECTOR _MMIO(0x4DFC)
+#define   GEN9_TRTT_BYPASS_DISABLE (1 << 0)
+
+/* TRTT registers in the H/W Context */
+#define GEN9_TRTT_L3_POINTER_DW0   _MMIO(0x4DE0)
+#define GEN9_TRTT_L3_POINTER_DW1   _MMIO(0x4DE4)
+#define   GEN9_TRTT_L3_GFXADDR_MASK    0xFFFFFFFF0000
+
+#define GEN9_TRTT_NULL_TILE_REG        _MMIO(0x4DE8)
+#define GEN9_TRTT_INVD_TILE_REG        _MMIO(0x4DEC)
+
+#define GEN9_TRTT_VA_MASKDATA      _MMIO(0x4DF0)
+#define   GEN9_TRVA_MASK_VALUE     0xF0
+#define   GEN9_TRVA_DATA_MASK      0xF
+
+#define GEN9_TRTT_TABLE_CONTROL        _MMIO(0x4DF4)
+#define   GEN9_TRTT_IN_GFX_VA_SPACE    (1 << 1)
+#define   GEN9_TRTT_ENABLE     (1 << 0)
+
 #define GAM_ECOCHK         _MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION (1<<25)
 #define   ECOCHK_SNB_BIT       (1<<10)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5765b6d7da26..cd5d6b7fd285 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1733,6 +1733,91 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req)
    return i915_gem_render_state_emit(req);
 }
 
+static int gen9_init_rcs_context_trtt(struct drm_i915_gem_request *req)
+{
+   u32 *cs;
+
+   cs = intel_ring_begin(req, 2 + 2);
+   if (IS_ERR(cs))
+       return PTR_ERR(cs);
+
+   *cs++ = MI_LOAD_REGISTER_IMM(1);
+
+   *cs++ = i915_mmio_reg_offset(GEN9_TRTT_TABLE_CONTROL);
+   *cs++ = 0;
+
+   *cs++ = MI_NOOP;
+   intel_ring_advance(req, cs);
+
+   return 0;
+}
+
+static int gen9_init_rcs_context(struct drm_i915_gem_request *req)
+{
+   int ret;
+
+   /*
+    * Explicitly disable TR-TT at the start of a new context.
+    * Otherwise on switching from a TR-TT context to a new Non TR-TT
+    * context the TR-TT settings of the outgoing context could get
+    * spilled on to the new incoming context as only the Ring Context
+    * part is loaded on the first submission of a new context, due to
+    * the setting of ENGINE_CTX_RESTORE_INHIBIT bit.
+    */
+   ret = gen9_init_rcs_context_trtt(req);
+   if (ret)
+       return ret;
+
+   return gen8_init_rcs_context(req);
+}
+
+static int gen9_emit_trtt_regs(struct drm_i915_gem_request *req)
+{
+   struct i915_gem_context *ctx = req->ctx;
+   u64 masked_l3_gfx_address =
+       ctx->trtt_info.l3_table_address & GEN9_TRTT_L3_GFXADDR_MASK;
+   u32 trva_data_value =
+       (ctx->trtt_info.segment_base_addr >> GEN9_TRTT_SEG_SIZE_SHIFT) &
+       GEN9_TRVA_DATA_MASK;
+   const int num_lri_cmds = 6;
+   u32 *cs;
+
+   /*
+    * Emitting LRIs to update the TRTT registers is most reliable, instead
+    * of directly updating the context image, as this will ensure that
+    * update happens in a serialized manner for the context and also
+    * lite-restore scenario will get handled.
+    */
+   cs = intel_ring_begin(req, num_lri_cmds * 2 + 2);
+   if (IS_ERR(cs))
+       return PTR_ERR(cs);
+
+   *cs++ = MI_LOAD_REGISTER_IMM(num_lri_cmds);
+
+   *cs++ = i915_mmio_reg_offset(GEN9_TRTT_L3_POINTER_DW0);
+   *cs++ = lower_32_bits(masked_l3_gfx_address);
+
+   *cs++ = i915_mmio_reg_offset(GEN9_TRTT_L3_POINTER_DW1);
+   *cs++ = upper_32_bits(masked_l3_gfx_address);
+
+   *cs++ = i915_mmio_reg_offset(GEN9_TRTT_NULL_TILE_REG);
+   *cs++ = ctx->trtt_info.null_tile_val;
+
+   *cs++ = i915_mmio_reg_offset(GEN9_TRTT_INVD_TILE_REG);
+   *cs++ = ctx->trtt_info.invd_tile_val;
+
+   *cs++ = i915_mmio_reg_offset(GEN9_TRTT_VA_MASKDATA);
+   *cs++ = GEN9_TRVA_MASK_VALUE | trva_data_value;
+
+   *cs++ = i915_mmio_reg_offset(GEN9_TRTT_TABLE_CONTROL);
+   *cs++ = GEN9_TRTT_IN_GFX_VA_SPACE | GEN9_TRTT_ENABLE;
+
+   *cs++ = MI_NOOP;
+   intel_ring_advance(req, cs);
+
+   return 0;
+}
+
 /**
  * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer
  * @engine: Engine Command Streamer.
@@ -1915,11 +2000,14 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
        engine->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
 
    /* Override some for render ring. */
-   if (INTEL_GEN(dev_priv) >= 9)
+   if (INTEL_GEN(dev_priv) >= 9) {
        engine->init_hw = gen9_init_render_ring;
-   else
+       engine->init_context = gen9_init_rcs_context;
+   } else {
        engine->init_hw = gen8_init_render_ring;
-   engine->init_context = gen8_init_rcs_context;
+       engine->init_context = gen8_init_rcs_context;
+   }
+
    engine->emit_flush = gen8_emit_flush_render;
    engine->emit_breadcrumb = gen8_emit_breadcrumb_render;
    engine->emit_breadcrumb_sz = gen8_emit_breadcrumb_render_sz;
@@ -2235,3 +2323,19 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv)
        }
    }
 }
+
+int intel_lr_rcs_context_setup_trtt(struct i915_gem_context *ctx)
+{
+   struct intel_engine_cs *engine = ctx->i915->engine[RCS];
+   struct drm_i915_gem_request *req;
+   int ret;
+
+   req = i915_gem_request_alloc(engine, ctx);
+   if (IS_ERR(req))
+       return PTR_ERR(req);
+
+   ret = gen9_emit_trtt_regs(req);
+
+   i915_add_request(req);
+   return ret;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 52b3a1fd4059..2399320d7f14 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -81,6 +81,7 @@ struct i915_gem_context;
 void intel_lr_context_resume(struct drm_i915_private *dev_priv);
 uint64_t intel_lr_context_descriptor(struct i915_gem_context *ctx,
                     struct intel_engine_cs *engine);
+int intel_lr_rcs_context_setup_trtt(struct i915_gem_context *ctx);
 
 /* Execlists */
 int intel_sanitize_enable_execlists(struct drm_i915_private *dev_priv,
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 18bc0ec618dd..926f9142c96e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1309,9 +1309,17 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_NO_ERROR_CAPTURE    0x4
 #define I915_CONTEXT_PARAM_BANNABLE    0x5
 #define I915_CONTEXT_PARAM_WATCHDOG    0x6
+#define I915_CONTEXT_PARAM_TRTT        0x7
    __u64 value;
 };
 
+struct drm_i915_gem_context_trtt_param {
+   __u64 segment_base_addr;
+   __u64 l3_table_address;
+   __u32 invd_tile_val;
+   __u32 null_tile_val;
+};
+
 enum drm_i915_oa_format {
    I915_OA_FORMAT_A13 = 1,
    I915_OA_FORMAT_A29,