Saupe, Dietmar
Publications (Search Results)
Technical Report on Visual Quality Assessment for Frame Interpolation
2019, Men, Hui, Lin, Hanhe, Hosu, Vlad, Maurer, Daniel, Bruhn, Andrés, Saupe, Dietmar
Current benchmarks for optical flow algorithms evaluate estimation quality by comparing the predicted flow field with the ground truth, and may additionally compare interpolated frames, based on these predictions, with the correct frames from the actual image sequences. For the latter comparison, objective measures such as the mean square error are applied. However, for applications like image interpolation, the user's expected quality of experience cannot be fully deduced from such simple quality measures. We therefore conducted a subjective quality assessment study by crowdsourcing for the interpolated images provided in one of the optical flow benchmarks, the Middlebury benchmark. We used paired comparisons with forced choice and reconstructed absolute quality scale values according to Thurstone's model using the classical least squares method. The results give rise to a re-ranking of the 141 participating algorithms, most of which are based on optical flow estimation, with respect to the visual quality of their interpolated frames. Our re-ranking shows the necessity of visual quality assessment as an additional evaluation metric for optical flow and frame interpolation benchmarks.
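The scale reconstruction mentioned above lends itself to a short sketch. Under Thurstone's Case V model, the inverse normal CDF of each empirical preference probability estimates a difference of scale values, and a least-squares solve recovers the scales. The count matrix `wins` and the regularization constant `eps` below are illustrative assumptions, not the study's actual data or settings.

```python
# Minimal sketch of Thurstone Case V scale reconstruction by least squares,
# assuming a hypothetical count matrix `wins` where wins[i, j] is the number
# of times condition i was preferred over condition j in forced-choice trials.
import numpy as np
from scipy.stats import norm

def thurstone_case_v(wins, eps=0.5):
    n = wins.shape[0]
    trials = wins + wins.T
    # Empirical preference probabilities, lightly regularized to avoid
    # infinite z-scores when one condition wins every comparison.
    p = (wins + eps) / (trials + 2 * eps)
    z = norm.ppf(p)  # z[i, j] estimates s_i - s_j under Thurstone's model
    # Build the overdetermined system A @ s = b over all ordered pairs.
    rows, b = [], []
    for i in range(n):
        for j in range(n):
            if i != j and trials[i, j] > 0:
                row = np.zeros(n)
                row[i], row[j] = 1.0, -1.0
                rows.append(row)
                b.append(z[i, j])
    # Fix the gauge freedom (scores are defined up to an additive constant)
    # by appending the constraint sum(s) = 0.
    rows.append(np.ones(n))
    b.append(0.0)
    s, *_ = np.linalg.lstsq(np.array(rows), np.array(b), rcond=None)
    return s
```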
KonIQ-10k: Towards an ecologically valid and large-scale IQA database
2018, Lin, Hanhe, Hosu, Vlad, Saupe, Dietmar
The main challenge in applying state-of-the-art deep learning methods to predict image quality in the wild is the relatively small size of existing quality-scored datasets. The reason for the lack of larger datasets is the massive resources required to generate diverse and publishable content. We present a new systematic and scalable approach to creating large-scale, authentic, and diverse image datasets for Image Quality Assessment (IQA). We show how we built an IQA database, KonIQ-10k, consisting of 10,073 images, on which we performed very large-scale crowdsourcing experiments to obtain reliable quality ratings from 1,467 crowd workers (1.2 million ratings). We argue for its ecological validity by analyzing the diversity of the dataset, by comparing it to state-of-the-art IQA databases, and by checking the reliability of our user studies.
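As a rough illustration of how crowd ratings of this kind are aggregated and their reliability checked, the sketch below computes per-image mean opinion scores and a split-half SROCC; the long-format table and its column names are hypothetical, not the KonIQ-10k data layout.

```python
# Minimal sketch: mean opinion scores (MOS) plus a split-half reliability
# check, assuming a hypothetical ratings table with columns
# (image_id, worker_id, rating); the column names are illustrative.
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def mos_and_reliability(ratings: pd.DataFrame, seed=0):
    mos = ratings.groupby("image_id")["rating"].mean()
    # Split workers into two random halves and correlate the per-image means
    # of the halves; a high SROCC indicates reliable crowd annotations.
    rng = np.random.default_rng(seed)
    workers = ratings["worker_id"].unique()
    rng.shuffle(workers)
    half = set(workers[: len(workers) // 2])
    a = ratings[ratings["worker_id"].isin(half)].groupby("image_id")["rating"].mean()
    b = ratings[~ratings["worker_id"].isin(half)].groupby("image_id")["rating"].mean()
    common = a.index.intersection(b.index)
    srocc, _ = spearmanr(a.loc[common], b.loc[common])
    return mos, srocc
```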
Spatiotemporal Feature Combination Model for No-Reference Video Quality Assessment
2018, Men, Hui, Lin, Hanhe, Saupe, Dietmar
One of the main challenges in no-reference video quality assessment is the temporal variation within a video. Existing methods were typically designed and tested on videos with artificial distortions, without considering spatial and temporal variations simultaneously. We propose a no-reference spatiotemporal feature combination model that extracts spatiotemporal information from a video, and we tested it on a database with authentic distortions. Compared with other methods, our model gave satisfactory performance for assessing the quality of natural videos.
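A minimal sketch of the general idea follows, combining per-frame spatial statistics with temporal pooling; the specific features below are illustrative stand-ins, not the paper's feature set.

```python
# Sketch: build one feature vector from spatial activity per frame and
# frame-to-frame differences, so both kinds of variation are represented.
import numpy as np

def spatiotemporal_features(frames):
    """frames: sequence of grayscale frames, each a 2-D float array."""
    grad_energy = []
    for f in frames:
        gy, gx = np.gradient(f)
        grad_energy.append(np.sqrt(gx ** 2 + gy ** 2).mean())  # spatial activity
    diffs = [np.abs(frames[t + 1] - frames[t]).mean()
             for t in range(len(frames) - 1)]
    # Pool each per-frame series over time with mean and standard deviation,
    # so temporal variation enters the feature vector explicitly.
    series = {"spatial": np.array(grad_energy), "temporal": np.array(diffs)}
    return np.array([stat for s in series.values()
                     for stat in (s.mean(), s.std())])
```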
SUR-Net: Predicting the Satisfied User Ratio Curve for Image Compression with Deep Learning
2019, Fan, Chunling, Lin, Hanhe, Hosu, Vlad, Zhang, Yun, Jiang, Qingshan, Hamzaoui, Raouf, Saupe, Dietmar
The Satisfied User Ratio (SUR) curve for a lossy image compression scheme, e.g., JPEG, characterizes the probability distribution of the Just Noticeable Difference (JND) level, the smallest distortion level that can be perceived by a subject. We propose the first deep learning approach to predicting such SUR curves. Instead of directly regressing the SUR curve for a given reference image, our model is trained on pairs of images, original and compressed. Relying on a Siamese Convolutional Neural Network (CNN), feature pooling, a fully connected regression head, and transfer learning, we achieved good prediction performance. Experiments on the MCL-JCI dataset showed a mean Bhattacharyya distance between the predicted and the original JND distributions of only 0.072.
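The Bhattacharyya distance used for evaluation is easy to state concretely. The sketch below assumes both JND distributions are given as discrete histograms over the same distortion levels.

```python
# Bhattacharyya distance between two discrete probability histograms.
import numpy as np

def bhattacharyya_distance(p, q, eps=1e-12):
    p = np.asarray(p, dtype=float); p /= p.sum()
    q = np.asarray(q, dtype=float); q /= q.sum()
    bc = np.sum(np.sqrt(p * q))       # Bhattacharyya coefficient in [0, 1]
    return -np.log(max(bc, eps))      # 0 when the histograms match exactly

# Example: two similar histograms over five hypothetical JND levels.
print(bhattacharyya_distance([0.1, 0.2, 0.4, 0.2, 0.1],
                             [0.1, 0.25, 0.35, 0.2, 0.1]))
```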
Expertise screening in crowdsourcing image quality
2018, Hosu, Vlad, Lin, Hanhe, Saupe, Dietmar
We propose a screening approach to find crowd workers who are reliable and effectively expert in image quality assessment (IQA). Our method measures the users' ability to identify image degradations by using test questions, together with several relaxed reliability checks. We conducted multiple experiments, obtaining reproducible results with a high agreement between the expertise-screened crowd and freelance experts of 0.95 Spearman rank-order correlation (SROCC), with one restriction on the image type. Our contributions include a reliability screening method for uninformative users, a new type of test question that relies on our proposed database of pristine and artificially distorted images, a group agreement extrapolation method, and an analysis of the crowdsourcing experiments.
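A sketch of the screening idea: keep only workers who answer enough planted test questions correctly, then check crowd-expert agreement with SROCC. The accuracy threshold and data layout are illustrative assumptions, not the paper's exact values.

```python
# Sketch: expertise screening via test questions, plus SROCC agreement
# between screened-crowd and expert scores.
import numpy as np
from scipy.stats import spearmanr

def screen_workers(test_answers, correct, min_accuracy=0.75):
    """test_answers: dict worker_id -> list of answers to the test questions."""
    keep = []
    for worker, answers in test_answers.items():
        acc = np.mean([a == c for a, c in zip(answers, correct)])
        if acc >= min_accuracy:       # keep only workers who can identify
            keep.append(worker)       # the planted image degradations
    return keep

def crowd_expert_agreement(crowd_mos, expert_mos):
    srocc, _ = spearmanr(crowd_mos, expert_mos)
    return srocc
```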
Empirical evaluation of no-reference VQA methods on a natural video quality database
2017-05, Men, Hui, Lin, Hanhe, Saupe, Dietmar
No-Reference (NR) Video Quality Assessment (VQA) is a challenging task, since it predicts the visual quality of a video sequence without comparison to a reference video. Several NR-VQA methods have been proposed; however, all of them were designed and tested on databases with artificially distorted videos. It therefore remained an open question how well these NR-VQA methods perform on natural videos. We evaluated two popular VQA methods on our newly built natural VQA database, KoNViD-1k. In addition, we found that merely combining five simple VQA-related features, i.e., contrast, colorfulness, blurriness, spatial information, and temporal information, already performed about as well as the established NR-VQA methods. However, all methods proved unsatisfactory when assessing natural videos (correlation coefficients below 0.6). These findings show that NR-VQA is not yet mature and needs substantial further improvement.
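The five features named above all have common, simple definitions, sketched below (RMS contrast, Hasler-Süsstrunk colorfulness, Laplacian variance for blurriness, and SI/TI in the spirit of ITU-T P.910); the paper's exact formulations may differ.

```python
# Sketch of five simple VQA-related features using standard definitions.
import numpy as np
from scipy.ndimage import laplace, sobel

def frame_features(rgb):
    """rgb: H x W x 3 float array in [0, 255]."""
    gray = rgb.mean(axis=2)
    contrast = gray.std()                                  # RMS contrast
    rg = rgb[..., 0] - rgb[..., 1]
    yb = 0.5 * (rgb[..., 0] + rgb[..., 1]) - rgb[..., 2]
    colorfulness = (np.hypot(rg.std(), yb.std())           # Hasler-Suesstrunk
                    + 0.3 * np.hypot(rg.mean(), yb.mean()))
    blurriness = laplace(gray).var()                       # low variance = blurry
    si = np.hypot(sobel(gray, 0), sobel(gray, 1)).std()    # spatial information
    return contrast, colorfulness, blurriness, si

def temporal_information(frames):
    # TI: maximum over time of the std of the frame-difference image.
    grays = [f.mean(axis=2) for f in frames]
    return max((g2 - g1).std() for g1, g2 in zip(grays, grays[1:]))
```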
Visual Quality Assessment for Motion Compensated Frame Interpolation
2019, Men, Hui, Lin, Hanhe, Hosu, Vlad, Maurer, Daniel, Bruhn, Andrés, Saupe, Dietmar
Disregarding the Big Picture: Towards Local Image Quality Assessment
2018, Wiedemann, Oliver, Hosu, Vlad, Lin, Hanhe, Saupe, Dietmar
Image quality has been studied almost exclusively as a global image property. It is common practice for IQA databases and metrics to quantify this abstract concept with a single number per image. We propose an approach to blind IQA based on a convolutional neural network (patchnet) trained on a novel set of 32,000 individually annotated patches of 64×64 pixels. We use this model to generate spatially localized quality maps of images taken from KonIQ-10k, a large and diverse in-the-wild database of authentically distorted images. We show that our local quality indicator correlates well with the global MOS, going beyond the predictive ability of quality-related attributes such as sharpness. Averaging the patchnet predictions already outperforms classical approaches to global MOS prediction that were trained to include global image features. We additionally experimented with a generic second-stage aggregation CNN to estimate mean opinion scores. The latter model performs comparably to the state of the art, with a PLCC of 0.81 on KonIQ-10k.
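A sketch of the patch-wise pipeline: tile the image into 64×64 patches, score each with a patch-level model, and pool. Here `patch_model` is a placeholder for a trained network such as patchnet, and simple averaging of the map stands in for the second-stage aggregation CNN.

```python
# Sketch: patch-wise quality prediction pooled into a local map and a
# global score. `patch_model` is a hypothetical callable, not provided here.
import numpy as np

def local_quality_map(image, patch_model, patch=64):
    """image: H x W x C array; patch_model: callable patch -> quality score."""
    h, w = image.shape[0] // patch, image.shape[1] // patch
    qmap = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            tile = image[i * patch:(i + 1) * patch,
                         j * patch:(j + 1) * patch]
            qmap[i, j] = patch_model(tile)   # one quality score per patch
    return qmap

def global_quality(image, patch_model):
    # Average pooling of the local map already yields a usable global
    # estimate; a learned second-stage aggregator can refine it.
    return local_quality_map(image, patch_model).mean()
```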
The Konstanz natural video database (KoNViD-1k)
2017, Hosu, Vlad, Hahn, Franz, Jenadeleh, Mohsen, Lin, Hanhe, Men, Hui, Sziranyi, Tamas, Li, Shujun, Saupe, Dietmar
Subjective video quality assessment (VQA) strongly depends on semantics, context, and the types of visual distortions. Currently, all existing VQA databases include only a small number of video sequences with artificial distortions. The development and evaluation of objective quality assessment methods would benefit from having larger datasets of real-world video sequences with corresponding subjective mean opinion scores (MOS), in particular for deep learning purposes. In addition, the training and validation of any VQA method intended to be ‘general purpose’ requires a large dataset of video sequences that are representative of the whole spectrum of available video content and all types of distortions. We report our work on KoNViD-1k, a subjectively annotated VQA database consisting of 1,200 public-domain video sequences, fairly sampled from a large public video dataset, YFCC100m. We present the challenges and choices we have made in creating such a database aimed at ‘in the wild’ authentic distortions, depicting a wide variety of content.
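Fair sampling can be illustrated along a single characteristic: bin the candidate videos by a feature value and draw equally from each bin, so the sample spans the feature's range rather than mirroring the skew of the source collection. KoNViD-1k's actual sampling considers several attributes jointly; this one-dimensional version is only a sketch.

```python
# Sketch: one-dimensional fair sampling over a feature's value range.
import numpy as np

def fair_sample(feature_values, per_bin, n_bins=10, seed=0):
    rng = np.random.default_rng(seed)
    values = np.asarray(feature_values, dtype=float)
    # Equal-width bins over the value range flatten the feature distribution.
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    chosen = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = np.where((values >= lo) & (values <= hi))[0]
        if len(idx):
            take = min(per_bin, len(idx))
            chosen.extend(rng.choice(idx, size=take, replace=False))
    return sorted(set(int(i) for i in chosen))   # indices of the sampled items
```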