This article explores the relations between how objects look (the shapes, colours, and textures) and how they feel to touch (the tactile property).
The authors suggest that people are good at predicting objects’ tactile properties from simply looking at photos of the objects – yet different photos of the same object can influence the tactile ‘prediction’.
— Gil Dekel
By Bei Xiao; Wenyan Bi; Xiaodan Jia; Hanhan Wei; Edward H. Adelson.
Humans can often estimate tactile properties of objects from vision alone. For example, during online shopping, we can often infer material properties of clothing from images and judge how the material would feel against our skin. What visual information is important for tactile perception? Previous studies in material perception have focused on measuring surface appearance, such as gloss and roughness, and using verbal reports of material attributes and categories. However, in real life, predicting tactile properties of an object might not require accurate verbal descriptions of its surface attributes or categories. In this paper, we use tactile perception as ground truth to measure visual material perception. Using fabrics as our stimuli, we measure how observers match what they see (photographs of fabric samples) with what they feel (physical fabric samples). The data shows that color has a significant main effect in that removing color significantly reduces accuracy, especially when the images contain 3-D folds. We also find that images of draped fabrics, which revealed 3-D shape information, achieved better matching accuracy than images with flattened fabrics. The data shows a strong interaction between color and folding conditions on matching accuracy, suggesting that, in 3-D folding conditions, the visual system takes advantage of chromatic gradients to infer tactile properties but not in flattened conditions. Together, using a visual–tactile matching task, we show that humans use folding and color information in matching the visual and tactile properties of fabrics.
In daily life, we can make predictions about the tactile and mechanical properties of objects that have not yet been touched (Adelson, 2001). For example, when we reach to pick up a glass of milk, we have already automatically made predictions about the weight and rigidity of the glass as well as the fluidity of the milk. The facts that glass is rigid and milk is fluid are crucial in planning the initial grip and lift force. What allows the visual prediction of tactile and mechanical properties of objects?
Different categories of materials, such as different types of food, wood, plastic, stone, and fabrics, exhibit different visual attributes that are characteristic of the materials (Fleming, 2014). Surface cues, such as color, texture, and reflectance patterns as well as 3-D shape, are often informative about the tactile and mechanical property of an object, such as its mass, stiffness, hardness, and surface friction, as well as functional properties, such as wetness, edibility, thermal conductivity, etc. Until recently, however, the field of material perception has concentrated on the passive perception of visual properties, such as surface gloss, translucency, and roughness, and material categories by measuring only visual responses (Anderson & Kim, 2009; Fleming, Wiebel, & Gegenfurtner, 2013; Giesel & Zaidi, 2013; Kawabe, Maruya, Fleming, & Nishida, 2015; Maloney & Brainard, 2010; Sharan, Rosenholtz, & Adelson, 2014). These studies revealed important image cues that are associated with material perception under a variety of contexts.
However, in reality, humans often use multiple senses to judge material properties of objects. Previous literature has shown that inputs from multiple senses often interact during material perception (Bonneel, Suied, Viaud-Delmon, & Drettakis, 2010; Buckingham, Cant, & Goodale, 2009; Fujisaki, Goda, Motoyoshi, Komatsu, & Nishida, 2014; Fujisaki, Tokita, & Kariya, 2015; Martín, Iseringhausen, Weinmann, & Hullin, 2015; Tiest & Kappers, 2007). The majority of multisensory studies of human material perception have focused on measuring one or a few specific attributes, such as surface roughness (Tiest & Kappers, 2007). A few studies have looked at the dimensionality of haptic and visual perception of material properties and found that the roles of visual modalities and haptic modalities were both overlapping and complementary (Bhushan, Rao, & Lohse, 1997; Hollins, Bensmaïa, Karlof, & Young, 2000; Rao & Lohse, 1993). Baumgartner, Wiebel, and Gegenfurtner (2013) asked observers to categorize and rate several material properties of 84 different material samples. The experiments were done with both visual-alone and haptic-alone conditions. They found that haptic and visual perception of material properties are highly correlated: a principal component analysis shows that material samples are similarly organized within both sensory modalities. Martín et al. (2015) compared visual and auditory perception of material properties by rating perceptual qualities (pairs of adjectives) using visual and auditory channels separately or together. Their results revealed that auditory cues have a strong bias toward tactile qualities of the materials.
Most of these multisensory studies used attribute rating as the main task. Hence, the results depended on observers’ ability to use language to describe material properties. In reality, however, we often directly access an object’s properties by touching it, without verbally describing those properties (see Klatzky & Lederman, 2010; Lederman & Klatzky, 2009; Tiest, 2010, for reviews). To test whether the bread you see in the market has a good crust but is still soft inside, the best way is to squeeze it by hand. Even when we cannot touch the object (such as during online shopping), visual information can wordlessly convey cues that allow us to predict the tactile properties of an object. For example, when we choose to buy a silk scarf, we look at its surface gloss, color, surface texture, and folds presented in the images to judge its material properties. Previous studies have shown that vision is sufficient to directly recognize tactile properties, such as surface roughness, and can guide the selection of a haptic exploration procedure (Lederman & Klatzky, 1993; Plaisier, Kappers, Tiest, & Ernst, 2010; Tiest & Kappers, 2007). In this study, we aim to use tactile sensation as ground truth to evaluate the success of visual perception of material properties. In addition, we wish to explore what visual information, such as shape and color, can efficiently convey tactile properties.
Figure 1A shows that different photographs of the same fabric from different online vendors convey different tactile and mechanical properties. The image that shows a human hand rubbing the fabric provides a much better sense of the fabric’s mechanical and tactile properties than the other photos.
The present study
In summary, past studies in material perception focused on measuring visual attributes with verbal reports. However, understanding material perception of objects in the real world requires multisensory inputs and natural tasks without verbal report. In this paper, restricting the stimuli to fabrics, we designed a natural task in which observers were asked to match what they see (without touching) to what they feel (without looking). Figure 2 shows the experiment task and apparatus. We manipulated the 3-D folding conditions of the fabrics as well as the color of the photographs and measured how well observers matched the photographs to the physical samples. Our goal was to discover image regularities that contribute to the prediction of tactile properties of fabrics from images.
In Experiment 1, we aimed to investigate whether color information and folds of fabrics in an image affect tactile perception of fabrics. We began by photographing fabrics under various folding conditions so that the photographs conveyed different shape information: 2D_Flat, 3D_Draping, and 3D_Hanging. We then created two color conditions: the original-color (red-green-blue, RGB) condition and the grayscale condition, in which we converted the color images to grayscale. We thus created six experimental conditions: 2D_Flat_RGB, 2D_Flat_Grayscale, 3D_Draping_RGB, 3D_Draping_Grayscale, 3D_Hanging_RGB, 3D_Hanging_Grayscale (Figure 3). In a tactile–visual matching task, we asked observers to arrange two pieces of fabric using their hands inside a box (without looking) so that their positions matched the images of the same pair of fabrics displayed on the monitor (Figure 2D). If the images reveal sufficient tactile properties, then observers will be able to tell which photo corresponds to which piece of the physical sample they feel.
In Experiment 1, we discovered that both color and folding shape have effects on visual and tactile matching. Our fabric pairs were chosen based on a pilot experiment using real samples in the draping condition to avoid a ceiling effect. However, this preselection may have favored the draping condition because the subset was selected to allow for errors under this condition. Additionally, we used a between-subjects design in Experiment 1 to minimize carryover effects. This raises the question of whether the differences between the groups of observers also contributed to the results. Finally, we were interested in whether our results would generalize to other types of apparel fabrics.
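The 3 × 2 design described above can be sketched in a few lines of Python. This is only an illustration: the condition labels come from the text, but the function names are hypothetical, and the grayscale conversion uses the common Rec. 601 luma weights as an assumption, since the article does not say how the color images were converted.

```python
from itertools import product

# Factor levels named in the text.
FOLDING = ["2D_Flat", "3D_Draping", "3D_Hanging"]
COLOR = ["RGB", "Grayscale"]


def condition_names():
    """Cross the folding and color factors into the six condition labels."""
    return [f"{fold}_{color}" for fold, color in product(FOLDING, COLOR)]


def to_grayscale(pixel):
    """Convert one 8-bit (R, G, B) pixel to a gray pixel.

    ASSUMPTION: Rec. 601 luma weights. The article does not specify
    the conversion the authors actually used.
    """
    r, g, b = pixel
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)


if __name__ == "__main__":
    for name in condition_names():
        print(name)
    # A saturated red pixel keeps its luminance but loses its chromaticity.
    print(to_grayscale((200, 30, 30)))
```

Applying `to_grayscale` to every pixel of a photograph removes the chromatic gradients while preserving the luminance pattern, which is the manipulation the grayscale conditions rely on.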
In Experiment 2, we aimed to solve the above issues by (a) using a different set of fabric samples that had the same size and similar thickness as those in Experiment 1, (b) choosing the fabric pairs from all possible combinations using a tactile similarity rating experiment, and (c) including a control experiment that involved all observers performing the matching task on the same set of fabrics.
Experiment 2 used the same tactile and visual matching task as in Experiment 1. We measured the same experimental conditions, RGB versus grayscale images for draping and flat conditions. Because in Experiment 1 we did not discover significant differences between the 3D_Draping and 3D_Hanging conditions, we only used the 3D_Draping condition in Experiment 2. Thus, we got four conditions in this experiment: 3D_RGB, 3D_Grayscale, 2D_RGB, and 2D_Grayscale.
The current study was inspired by the observation that humans were good at predicting how objects would feel just by looking (such as judging clothing properties during online shopping). Of particular interest is the question of what visual information affects the prediction of tactile properties of objects. Previous work in multisensory material perception primarily measured perception separately for each sensory modality using verbal report to describe material attributes. Here, we used the tactile sensation as ground truth to evaluate the visual perception of the fabric materials by manipulating the photographic conditions of the fabrics.
In both Experiments 1 and 2, we found that observers matched their tactile perception of the materials to visual perception of the materials with lower accuracy if the color information was removed from the images. Furthermore, we discovered that removing color information significantly worsened the matching accuracy in the 3D_Folding conditions but not in the 2D_Flat conditions. In addition, we observed that images containing fabrics with 3-D folding information significantly improved the matching accuracy if the color information was also preserved. When the color information was removed, the matching accuracy was no longer affected by the folding conditions. Although the same results were obtained using different sets of fabrics and observers, the within-subject control experiment did not replicate the main findings of Experiments 1 and 2 (Table 3). We think that this might be due to (a) the reduced set of stimuli (there were only three matching pairs in each of the four conditions) and (b) carryover and ceiling effects: The control experiment was conducted after the main experiment, and observers might have developed an efficient strategy to do this task. The high mean accuracy of the four conditions (all above 95%) is consistent with this possibility, indicating that a ceiling effect may have made the differences between the four conditions seem to disappear.
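The color-by-folding interaction described above can be made concrete with a small Python sketch. The trial outcomes below are invented purely for illustration and are not the study's data; the function names are hypothetical, and the "difference of differences" is a descriptive contrast, not the statistical analysis the authors ran.

```python
from collections import defaultdict


def make_trials(folding, color, n_correct, n_total):
    """Build hypothetical trial records: (folding, color, was_correct)."""
    return [(folding, color, i < n_correct) for i in range(n_total)]


# INVENTED outcomes, chosen only to mimic the reported pattern.
trials = (
    make_trials("3D", "RGB", 4, 4)
    + make_trials("3D", "Grayscale", 3, 4)
    + make_trials("2D", "RGB", 3, 4)
    + make_trials("2D", "Grayscale", 3, 4)
)


def accuracy_by_condition(records):
    """Return the proportion of correct matches per (folding, color) cell."""
    counts = defaultdict(lambda: [0, 0])  # [correct, total]
    for folding, color, correct in records:
        cell = counts[(folding, color)]
        cell[0] += int(correct)
        cell[1] += 1
    return {key: c / n for key, (c, n) in counts.items()}


def color_effect(acc, folding):
    """Accuracy gained by keeping color, within one folding condition."""
    return acc[(folding, "RGB")] - acc[(folding, "Grayscale")]


def interaction_contrast(acc):
    """Difference of differences: color effect in 3D minus color effect in 2D."""
    return color_effect(acc, "3D") - color_effect(acc, "2D")


if __name__ == "__main__":
    acc = accuracy_by_condition(trials)
    print("color effect, 3D:", color_effect(acc, "3D"))
    print("color effect, 2D:", color_effect(acc, "2D"))
    print("interaction contrast:", interaction_contrast(acc))
```

A contrast near zero would mean that color helps equally in flat and folded images; a positive value, as in this made-up example, corresponds to the reported pattern in which color matters mainly when 3-D folds are visible.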
Effect of color
Interestingly, both Experiments 1 and 2 found that removing color significantly decreased accuracy for the 3-D folding conditions but had little effect on the 2-D conditions (Figure 5). It is possible that in the 2D_Flat conditions, observers mainly relied on texture information to visually distinguish the fabrics so that color information became irrelevant. In the 3-D folding conditions, by contrast, the texture information was less dominant (due to lower spatial resolution of the 3D_Draping images in comparison to 2D_Flat images) so that color played a significant role. Under this hypothesis, we would expect that color had stronger effects on the fabric pairs that were similar in textures but different in mechanical aspects (e.g., two satins that are different in stiffness). Figure 8B confirmed this hypothesis by showing that color has significant effects when both of the fabrics are within the same categories, such as when both fabrics are shiny (glossy–glossy) and when they are both matte (matte–matte).
Our first question is which fabric pairs showed the biggest effects of color on matching accuracy. Figure 9 shows examples of stimulus pairs that produced large errors in grayscale images but small errors in RGB images. We see that color improves performance when both fabrics are shiny but differ in their degree of shininess. For example, fabric pair 12 showed two pieces of shiny fabric, a ripstop and a satin, that had no errors in the RGB conditions but two error counts in the grayscale conditions. This may be because specular highlights on the folded surface are easier to detect and separate from the diffuse reflectance in the RGB images than in the grayscale images. More importantly, specular highlights inform the geometry of the folded fabric sample, hence indirectly affecting its display of mechanical properties, such as stiffness. The effects of color on perception of surface gloss have been discussed in several articles, but systematic research is needed (Chadwick & Kentridge, 2015; Hanada, 2012; Leloup, Pointer, Dutré, & Hanselaer, 2012; Nishida, Motoyoshi, & Maruya, 2011).
Even when the fabrics do not have apparent specular highlights, chromaticity variation within the surface caused by mutual reflections within the folds can also provide cues to surface properties. For example, fabric pair 17 in Figure 9 is two pieces of corduroy and linen. Corduroy has tiny fibers that have distinctive reflective properties (there are strong color gradients from shaded areas to bright areas), but this information is reduced when color information is removed. One possibility is that the inter-reflectance between the folds in the color images could be characteristic of material properties and thus provide information about the lighting geometry (Fleming, Holtmann-Rice, & Bulthoff, 2011; Harding, Harris, & Bloj, 2012; Ruppertsberg, Bloj, & Hurlbert, 2008). To further understand the effects of color on the perception of complex materials, one can use computer-rendered stimuli to systematically measure material perception by isolating surface reflection, textures, and lighting parameters.
Color information can also indicate high-level representation of material categories. For example, on average, upholstery fabrics tend to have darker color than jersey shirts. If this was the only information that was removed when we used the grayscale images, we would expect the accuracy for the 2D_Flat conditions to also decrease. But the data shows removing color has little effect on the 2D_Flat conditions. Giesel and Zaidi (2013) found no effect of color on the material classification of images of fabrics. It is possible the stimuli used in their study resembled our stimuli in the 2D_Flat images, in which the fabrics were flat in the image and have few wrinkles and folds. However, we cannot rule out the possible role of high-level color association in the tactile–visual matching task, which is suggested by previous research (Maloney & Brainard, 2010; Olkkonen, Hansen, & Gegenfurtner, 2008).
Effect of the 3-D folds
Our study found that folding condition affects matching accuracy for RGB image conditions. We also found that folding had significant effects on the fabric pairs when both were shiny, and this effect was independent of color condition (Figure 8). The explanation could be that 3-D drape improved the impression of glossiness. Surface glossiness is related to surface smoothness, which is a tactile property. Hence, being able to perceive glossiness lets observers infer the smoothness of the fabric. Figure 10 shows examples of fabric pairs that produced large errors in the 2D_Flat conditions but small errors in the 3D_Folding conditions. Several examples were composed of two glossy samples. For example, fabric pair 11 in Figure 10 was composed of a red ripstop nylon and a yellow satin. The satin fabric is much shinier than the ripstop fabric, but this difference is difficult to see in the flat conditions, in which there were no specular highlights. Previous findings showed that the visual system used shading cues to estimate reflective properties of the surface and materials, and 3-D shape influenced material perception (Giesel & Zaidi, 2013; Ho, Landy, & Maloney, 2008; Kerrigan & Adams, 2013; Kim, Marlow, & Anderson, 2011, 2012; Marlow, Kim, & Anderson, 2011; Motoyoshi, 2010; Radonjic & Gilchrist, 2013; Vangorp, Laurijssen, & Dutré, 2007; Wijntjes, Doerschner, Kucukoglu, & Pont, 2012). A recent study also shows that the presence of specular highlights increased perceived surface convexity (Adams & Elder, 2014). It is possible that the presence of specular highlights helps recover the 3-D shape of the fabrics, hence improving understanding of their mechanical properties, such as softness and rigidity.
However, the folding condition also affects fabrics that are not glossy, as revealed by individual examples shown in Figure 10. When two pieces of fabric are similar in their textures, visible draping folds help to infer the difference in mechanical properties, such as stiffness. Fabric pair 9 in Figure 10 was two pieces of linen that differ in stiffness. The shape of the draping folds revealed that the blue linen was a little stiffer than the gray linen. This difference was difficult to infer in the 2-D flat conditions.
We also discovered that there was an interaction between the effects of color and folding conditions. In the grayscale conditions, the effects of folding on matching accuracy became small. During the visual–tactile matching experiment, multiple cues were present for the observer to use, such as chromatic gradients, specular highlights, contrast between bright and shadowed parts, and 3-D textures. It is possible that observers weighted these cues differently for different stimuli. When the image was in color, chromaticity gradients and specular highlights might be as important as texture information. However, when color information was removed, observers might choose to focus only on texture information, ignoring the shape-related intensity gradients. This could be the reason why, in grayscale images, the matching accuracy was similar in the 3-D and 2-D conditions. In the future, it would be interesting to isolate these cues and construct a cue-combination model of visual and tactile matching.
Role of 3-D textures
We discovered that both color and folding condition have strong effects on fabric pairs that are similar in texture, such as when both are shiny and smooth (Figures 9 and 10). When two fabrics are in different categories (glossy and matte), 3-D texture cues (not only the patterns of the fabrics but also the thread counts, woven patterns, surface relief, etc.) are very important for discrimination of fabrics independent of folding conditions. In our fabric samples, there were many fabric pairs with different 3-D textures, such as corduroy and linen. Observers could use 3-D textures to predict roughness, friction, and undulation of the fabrics. This information was also present in the 2D_Flat conditions. This could also explain why the matching accuracy was high in the current study (around 75% on average for both Experiments 1 and 2). This is consistent with the findings of Giesel and Zaidi (2013), which showed that the visual system could use the 2-D luminance variations that arise from the 3-D textures of the materials to perceive fabric attributes, such as thickness, undulation, and roughness.
Effect of lighting
In this study, we kept the lighting conditions constant across the folding conditions. However, we recognize the potential effect of lighting geometry on the results. It has recently been shown that for image-based illumination (Debevec & Malik, 1997) the geometry of the light field determines the level of perceived gloss (Doerschner, Boyaci, & Maloney, 2010; Olkkonen & Brainard, 2010). The direction of lighting also affects perception of material properties. It has been shown that direction of lighting affects perception of surface roughness (Ho, Serwe, Trommershäuser, Maloney, & Landy, 2009) and translucency (Xiao et al., 2014). It was also shown that the discrimination of 3-D shape is improved when specular highlights are present (Norman, Todd, & Orban, 2004). Even though we do not suspect the primary result of the effect of color and shape would change if we varied the lighting geometry, it is possible that changing lighting geometry would have a similar effect as including 3-D shading cues for visual and tactile matching. Searching for an optimal lighting geometry to improve prediction of tactile properties of materials would be a valuable next step.
The role of tactile exploratory procedure
The current study focused on the effect of visual stimuli on tactile and visual matching. We did not restrict the exploration mode of tactile perception in this study. In the postexperiment survey we conducted, however, all observers reported that they used one hand for each of the fabrics. They also reported using hand movements such as “scratching,” “rubbing,” and “lifting up” as their tactile strategy. In Experiment 2, we required observers to write down which of these three strategies they used to match each pair. Overall, the mean percentage of reported use across all conditions was 25.7% for scratching, 88.8% for rubbing, and 21.2% for lifting up (observers could report more than one strategy per pair, so the percentages sum to more than 100%). More interestingly, we found that use of a tactile exploratory procedure interacted with color conditions. Removing color information significantly increased the use of lifting up, F(1, 64) = 24.13, p < 0.001, ηp² = .17, and scratching, F(1, 64) = 4.24, p < 0.05, ηp² = .06, and had no influence on the use of rubbing, F(1, 64) = 0.91, p > 0.10, ηp² = .01.
Seminal work by Lederman and Klatzky (1987) showed that human observers are very good at using optimal exploratory procedures (such as lateral motion, contour following, etc.) to acquire different knowledge of objects. Recent studies also show that visual perception of material properties affects planning of lifting and grasping movements of hands (Buckingham et al., 2009). It would be interesting to explore the relationship between visual information and tactile exploration mode in future studies.
The central question in this article concerns how visual information of a soft, deformable object (such as its color, folds, and textures) affects prediction and judgment of its tactile properties. Empirical observations show that people are good at predicting object tactile properties. Using a natural tactile–visual matching task, we found that images of draped fabrics that reveal 3-D shape information allowed for better matching accuracy than images only containing flattened fabrics, which reveal mainly textural information. We also found that color played an important role in predicting tactile properties from images in the 3-D conditions. This suggests that color might be an important visual cue for material perception possibly through interaction with 3-D shape. By analyzing the effects on different categories of fabric, we found that the effects of color and folding condition were both stronger when fabrics were similar in 3-D textures, especially when both fabrics were glossy.
In conclusion, different images of the same object can influence tactile prediction of the object’s material properties. Three-dimensional shape features, such as wrinkles and folds, color gradients across the surface, and 3-D textures all provide useful information for predicting tactile and mechanical properties of soft, deformable objects from images.
The authors wish to thank Dr. Hendrikje Nienborg for valuable feedback on the manuscript and the data analysis and Dr. Ruth Rosenholtz for very useful discussions of the data analysis. We also wish to thank undergraduate students Alex Perepechko and Laura Uribe Tuiran at American University for preparing the stimuli. The work was partially supported by the Google Faculty Research Award to EA in 2012–2013, the MIT I2 intelligence initiative postdoctoral fellowship to BX in 2013–2014, and American University faculty startup funding (Experiment 2).
Commercial relationships: none.
Corresponding author: Bei Xiao.
Address: Department of Computer Science, American University, Washington, DC, USA.
Published in Journal of Vision, February 2016, Vol.16, 34. doi:10.1167/16.3.34.
Uploaded to this website with permission, on 19 July 2016.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.