
Sketch2Prototype: rapid conceptual design exploration and prototyping with generative AI

Published online by Cambridge University Press: 16 May 2024

Kristen M. Edwards*
Affiliation:
Massachusetts Institute of Technology, United States of America
Brandon Man
Affiliation:
Massachusetts Institute of Technology, United States of America
Faez Ahmed
Affiliation:
Massachusetts Institute of Technology, United States of America

Abstract


Sketch2Prototype is an AI-based framework that transforms a hand-drawn sketch into a diverse set of 2D images and 3D prototypes through three stages: sketch-to-text, text-to-image, and image-to-3D. Demonstrated on a variety of sketches, the framework rapidly generates text, image, and 3D modalities to support early-stage design exploration. We show that using text as an intermediate modality outperforms direct sketch-to-3D baselines for generating diverse and manufacturable 3D models. We also identify limitations in current image-to-3D techniques and note the value of the text modality for user feedback.
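The three-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sketch_to_prototypes`, the `PrototypeResult` container, the optional `revise_text` hook, and the stub stages are all hypothetical, and each stage is passed in as a plain callable so that no particular model or library is assumed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PrototypeResult:
    """Outputs collected from one pass through the three stages (hypothetical container)."""
    description: str
    images: List[str] = field(default_factory=list)
    models: List[str] = field(default_factory=list)


def sketch_to_prototypes(
    sketch: str,
    sketch_to_text: Callable[[str], str],
    text_to_images: Callable[[str, int], List[str]],
    image_to_3d: Callable[[str], str],
    n_images: int = 4,
    revise_text: Optional[Callable[[str], str]] = None,
) -> PrototypeResult:
    """Chain sketch-to-text, text-to-image, and image-to-3D stages.

    Every stage is an assumption supplied by the caller; this function only
    fixes the order in which the stages run.
    """
    # Stage 1: describe the sketch in text.
    description = sketch_to_text(sketch)
    # Optional feedback step: the text can be edited before any image is generated.
    if revise_text is not None:
        description = revise_text(description)
    # Stage 2: generate several candidate images from the same description.
    images = text_to_images(description, n_images)
    # Stage 3: lift each image to a 3D model.
    models = [image_to_3d(image) for image in images]
    return PrototypeResult(description, images, models)


# Stub stages standing in for real models (placeholders, for illustration only).
def stub_sketch_to_text(sketch: str) -> str:
    return f"description of {sketch}"


def stub_text_to_images(text: str, n: int) -> List[str]:
    return [f"image_{i}" for i in range(n)]


def stub_image_to_3d(image: str) -> str:
    return f"mesh_from_{image}"
```

Passing the stages as callables keeps the intermediate text inspectable and editable, which is one way to reflect the abstract's point about the text modality supporting user feedback.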

Type
Artificial Intelligence and Data-Driven Design
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
The Author(s), 2024.
