[Feature] Enhancing face id Correlation for Video #5936
Replies: 9 comments 19 replies
-
We're considering (1) for both smart search and facial recognition. The advantages you've raised apply to both. Additionally, for smart search, it would mean you could search for a specific scene in a video. Another possible feature would be to run machine learning on the fly when a video is paused, showing the faces detected in that frame. This could also be used to add OCR for a frame once that machine learning task has been added.
-
It would be a nice addition to have facial recognition work on videos 💯
-
See also the corresponding thread on Discord:
-
This feature would be amazing in Immich. I am currently searching for pictures of a cherished relative who has passed away, and it is a lot of work. I also know I am missing a lot because I can't watch every single video.
-
One approach that IMO may be fairly simple is to use ffmpeg to extract screenshots from the video and run them through the same systems you already have in place for face recognition and smart search. The images could be removed afterwards or saved as a stack of reference images for the video. The tricky part is determining the time interval for grabbing images. This could be a system setting changed by the user, or a fixed rate that could be overridden in the UI video by video.
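To make the ffmpeg idea above concrete, here is a rough sketch of how frames could be pulled at a fixed interval. It is not how Immich does anything today: the function names, the output file pattern, and the default 5-second interval are all made up for illustration. It only relies on ffmpeg's `fps` video filter, and it assumes `ffmpeg` is on the PATH.

```python
import subprocess
from pathlib import Path


def build_extract_command(video: Path, out_dir: Path, interval_s: int) -> list[str]:
    """Build an ffmpeg command that writes one JPEG every `interval_s` seconds.

    Hypothetical helper for illustration; the naming pattern is an assumption.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be a positive number of seconds")
    return [
        "ffmpeg",
        "-i", str(video),
        # fps=1/N makes the filter emit one frame per N seconds of video.
        "-vf", f"fps=1/{interval_s}",
        # JPEG quality scale: lower means better quality.
        "-q:v", "2",
        str(out_dir / "frame_%05d.jpg"),
    ]


def extract_frames(video: Path, out_dir: Path, interval_s: int = 5) -> list[Path]:
    """Run ffmpeg and return the extracted frame paths in order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_extract_command(video, out_dir, interval_s), check=True)
    return sorted(out_dir.glob("frame_*.jpg"))
```

The returned paths could then be handed to whatever face detection step already exists, and deleted afterwards or kept as reference images. The interval is the knob discussed above: a smaller value catches more faces but costs more processing time.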
-
Any updates on this?
-
Maybe this is interesting:
-
I think what Immich needs has already been implemented in Jellyfin: it is called Trickplay, and it is used to help with seeking on the timeline. It creates a file with frames every x seconds, and I think such an image could be passed into face detection/recognition. You may want to look at this (and related) commit jellyfin/jellyfin@ca7d1a1 https://deepwiki.com/search/explain-trickplay_863319de-15e3-4cc6-83d4-048c6773bff2
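If a sheet of frames taken every x seconds were reused this way, each tile would need to be cut out and mapped back to a timestamp before going to face detection. The sketch below assumes equal-size tiles laid out row by row in a grid; that layout, the function names, and the numbers in the example are assumptions for illustration, so the actual Trickplay format should be checked against Jellyfin before relying on it.

```python
def tile_box(index: int, tile_w: int, tile_h: int, columns: int) -> tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) pixel box of tile `index`.

    Assumes a row-major grid of equal-size tiles (hypothetical layout).
    """
    if index < 0 or columns <= 0:
        raise ValueError("index must be >= 0 and columns must be > 0")
    row, col = divmod(index, columns)
    left = col * tile_w
    top = row * tile_h
    return (left, top, left + tile_w, top + tile_h)


def tile_timestamp(index: int, interval_s: float) -> float:
    """Return the video position in seconds that tile `index` was sampled at."""
    return index * interval_s


# Example: tile 12 in a 10-column sheet of 320x180 tiles, one frame every 10 s.
box = tile_box(12, 320, 180, 10)   # second row, third column
seconds = tile_timestamp(12, 10)   # position in the video for that tile
```

The box is in the same form image libraries usually expect for cropping, so each tile could be cropped and passed to detection, with the timestamp stored alongside any face that is found.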
-
Any update on this? It's been years
-
The feature
In my observations, it appears that the correlation with face id is established based on the first frame of the video. While this is a straightforward and easily implementable solution, there are some potential issues.
Notably, a video may not begin with a person in frame for various reasons, such as the camera being out of focus or the video starting with a shot of the floor (as is common in many family videos). In such cases the first frame contains no usable face.
To address this, two potential solutions come to mind:
Ideally, the implementation would incorporate both solutions.
Platform